ABSTRACT


Without the protection of the atmosphere, space systems have to mitigate radiation
effects. Several different technologies are used to deal with different radiation effects
in order to keep a space device working properly. One of these effects, the Single Event
Upset (SEU), can change the state of a component or the data on a bus. A single error can
cause a system failure if it is not corrected.

Besides error correction, a space system also needs the flexibility to be modified or
upgraded easily. Consequently, the idea of instantiating a Triple Modular Redundancy
(TMR) design in an FPGA to construct a Configurable Fault-Tolerant Processor (CFTP) was
developed. TMR, which runs one program on three identical soft-core processors with
voters, is a scheme used to mitigate an SEU. The full TMR design running in an FPGA
functions as a System-On-a-Chip (SOC). Both the soft-core processor and the FPGA give the
CFTP great flexibility to be reconfigured.

A complete TMR design includes some fundamental components besides processors and
voters, such as the Reconciler, the Interrupt, and the Error Syndrome Storage Device
(ESSD). Each of these components has a unique function in the TMR design. They are
created and simulated in this thesis. Factors that affect test-bench settings, such as
processor pipelining, must always be kept in mind. A component is first designed to
implement its proper functions. It is then revised to work with the processor and memory.
The full TMR design in this thesis demonstrates its ability to detect and correct an SEU.
The suggested follow-on research is to improve the efficiency and performance of this
design.
EXECUTIVE SUMMARY


Space systems suffer radiation effects in space. These effects occur randomly and are
hard to predict. The combination of effects can destroy a system or render it useless.
Therefore, different methods, such as radiation-hardened or fault-tolerant systems, have
been presented to protect space devices. Space systems are usually tested and simulated
several times before launch in order to minimize the probability of losing control of
them after launch.

The Single Event Upset (SEU) is a radiation effect that flips a bit in a device. This
effect is not strong enough to destroy a system but may cause a series of errors that
finally make the system unusable. The error should be corrected in time, and Triple
Modular Redundancy (TMR) is one of the schemes used to mitigate this problem.

The TMR design selected for the CFTP instantiates three soft-core processors, together
with some other components, in a fault-tolerant Field Programmable Gate Array (FPGA). The
FPGA is easily reconfigured, and the soft-core processor has great flexibility to be
programmed or modified. These features give a TMR design the ability to be maintained and
upgraded. The processor chosen for the TMR design is a 16-bit Reduced Instruction Set
Computer (RISC) processor named KDLX. It is a 5-stage pipelined processor with Harvard
architecture. The pipeline affects the settings of a test bench, and this influence is
discussed in this thesis. A full simulation of all instructions is presented to help
explain the functions of the different operation codes.

To stop an error from propagating, the TMR has to correct the error once it is detected.
The three processors in the TMR should always execute the same instruction, and all their
actions should be identical. Any inconsistency found among the three processors is
considered an error. The TMR then needs a function to stall the current operation and
correct errors in the processors. For error detection and correction, the following four
major components are designed: the majority bit voter, the Reconciler, the Interrupt, and
the Error Syndrome Storage Device (ESSD).
Voters are connected to the output pins or buses of the processors, so all output signals
are voted. The majority bit voter takes the value on which at least two of the three
signals agree as the output signal and reports an error if one of the three is different.
The voter is able to correct an error immediately and indicate where the error is. The
three processors together with their voters are called the TMR Assembly.

Because the processor and the memory have different architectures, a Reconciler is
responsible for coordinating the two. The solution is to run the memory twice as fast as
the processor and let the Reconciler route the memory data. The memory acts as an
instruction memory during the first half of the processor clock cycle and as a data
memory during the other half. Thus, the processor behaves as if it were connected to two
different memories. The Reconciler in the TMR for this thesis is purely a reconciler and
does nothing directly related to error detection or correction. This purity makes it
independent of other components.

When an error is detected by the voters, the Interrupt starts the Interrupt Service
Routine (ISR). In order to store and read properly, this component has to run as fast as
the Reconciler. The Interrupt replaces the current instruction on the bus with a TRAP
instruction when an error occurs. This TRAP instruction is fetched and executed by all
processors. The ISR is a special program designed to correct inconsistencies in register
contents among the three processors. At the end of the ISR, the Interrupt injects a Jump
instruction onto the instruction bus and leads the processors back to normal operation.

The ESSD latches some specific data from the buses when an error occurs. These data are
called the error syndrome, which is unique to one specific error. Error syndromes are
very useful for checking the health of a system or debugging its errors. In order to
latch data at the correct time, the ESSD has to run as fast as the Reconciler (or the
Interrupt). The ESSD does not pass its data to the Reconciler when storing. Instead, it
takes over the whole memory and saves error syndromes while the processors are
deliberately stalled.

The full design consolidates all components to construct a complete TMR design. The
design was simulated and its function was verified in this thesis. This initial design
gives a big picture of how errors are detected and corrected. Furthermore, the
interaction between different components is one of the important concepts to learn. The
full design has four different clocks. The Reconciler, Interrupt and ESSD use the same
clock speed since none of them needs a signal from another. The other three clocks are
the KDLX clock, the memory clock and one special clock for the latch.

For further research, extra circuits or components are needed to improve the ability to
correct errors in different components. For example, if an error is generated in the
Reconciler, it may never be found, and the data stored to memory will always be wrong.
Reinforcing the reliability of some components therefore needs to be considered. The
current design may be modified to meet the requirements of advanced functions. Finally, a
search for a better processor to enhance performance is required as well. Commercial
processors usually come with a software package and have better customer support. Cores
shared publicly through OpenCores are free, but a user needs a background in coding in
order to realize the core.


I. INTRODUCTION


An electronic device in the space environment faces an extreme challenge to its
reliability due to the lack of atmosphere and huge temperature variations. Without the
protection of the atmosphere, a space system is exposed to a unique environment that
contains cosmic rays (85% protons, 14% alpha particles and 1% heavy nuclei), solar events
(X-rays, heavy ions and protons) and trapped radiation (electrons and protons trapped in
the magnetic field of the Earth, called the Van Allen Belts). Thus, radiation effects on
a space electronic system are among the most important issues that need to be solved.
These effects include Total Dose Effects and Single Event Effects.

A number of methods have been presented to mitigate radiation effects. Using soft-core
Triple Modular Redundancy (TMR) on a Field Programmable Gate Array (FPGA) provides a
practical solution to Single Event Effects that is low cost, can be reconfigured, and is
easy to develop. The Configurable Fault-Tolerant Processor (CFTP) is a system based on
this concept, utilizing Commercial-Off-the-Shelf (COTS) technology and the features of
TMR soft-core microprocessors on FPGAs as a System-On-a-Chip (SOC).

A. RADIATION EFFECTS 

Radiation effects on a space system vary with altitude, location and solar events. For
example, the inner Van Allen Belt, from 650 km to 6300 km above Earth’s surface, is
composed mostly of protons of about 10 to 15 MeV (1 MeV = 10^6 eV, 1 eV ≈ 1.6 × 10^-19 J).
As a satellite travels in Low-Earth Orbit (LEO), from 160 to 6000 km, it has many chances
to be affected by protons. The scheme used to solve radiation problems on this satellite
must be different from the one used on a satellite in geostationary orbit, whose altitude
is 35,780 km. Since a satellite in geostationary orbit has almost no protection from
Earth, it needs to be more radiation-hardened (RADHARD) or radiation-tolerant. The major
effects caused by radiation are Total Dose Effects and Single Event Effects (SEE),
including Single Event Phenomenon (SEP), Single Event Upset (SEU), Single Event Latchup
(SEL) and Single Event Burnout (SEB) [1].
1. Total Dose Effects

Total Dose Effects refer to the total radiation that a device accumulates over its
lifetime. This accumulation degrades performance until the device becomes totally
useless. The general solution used so far to mitigate these effects is radiation
hardening or shielding, but such methods can only extend the life of the chip, not
totally eliminate the problem.

2. Single Event Phenomenon (SEP) 

Single Event Phenomenon is the situation where a transistor resets to its original state
because a particle passes through it. This causes unpredictable results and may or may
not affect the operation of a system.

3. Single Event Upset (SEU) 

A Single Event Upset is the change of a logical bit caused by radiation. A bit flip may
cause a chain reaction and consequently result in an unrecoverable system error. TMR is a
mitigation scheme that uses three identical processors to run the same instructions and
votes on all results to detect and correct such an error.

4. Single Event Latchup (SEL) and Single Event Burnout (SEB) 

Single Event Latchup occurs when a parasitic transistor is formed by a spurious current
spike, such as one caused by a heavy cosmic ray [2]. This puts a circuit into a
high-operating-current mode that has to be cleared by a power off-on reset. Hard errors
can drag the bus voltage down or even burn out the circuit. This is called Single Event
Burnout.


Some techniques used to mitigate radiation effects are shown in Table 1. 


Radiation Effects            Mitigation Techniques
---------------------------  ---------------------------
Total Dose                   Radiation-Hardening
                             Silicon-On-Sapphire
                             Silicon-On-Insulator
                             Thin-Gate-Oxide
                             Shielding
Single Event Latchup (SEL)   Radiation Hardening
                             Guard Rings
Single Event Upset (SEU)     Quadded Logic
                             Software Fault Tolerance
                             Triple Modular Redundancy


Table 1. Radiation Effects and Mitigation (From Ref. [1].)
B. FIELD PROGRAMMABLE GATE ARRAY (FPGA)


Sequential programmable devices are composed of gates and flip-flops and are able to
perform a variety of functions. The three major types of sequential programmable devices
are the Sequential (or Simple) Programmable Logic Device (SPLD), the Complex Programmable
Logic Device (CPLD) and the Field Programmable Gate Array (FPGA). An SPLD, which
integrates an AND-OR array and flip-flops, is the smallest and cheapest form of
programmable logic. A CPLD is similar to an SPLD except that it is a collection of
individual PLDs. The interconnections between the PLDs are programmable as well. A
typical CPLD is equivalent to 2 to 64 SPLDs. An FPGA consists of logic cells surrounded
by a ring of programmable I/O blocks. Each cell implements a logic function defined by
programming, and all interconnections between cells are also programmable.



Figure 1. Composition of FPGA (From Ref. [3].) 


Unlike the FPGA, PLDs need to be physically removed from a system and reprogrammed by
specific methods. This disadvantage makes a space system built from these devices almost
impossible to modify or upgrade. Programmed circuits can be easily instantiated on an
FPGA without any specific requirements. This feature reduces the time-to-market of a
product as well. Compared with other devices, FPGAs consume less power, are less
expensive, and offer the large-scale advantages of programmable logic and high
flexibility [4].

The FPGA selected for the CFTP is the Virtex XCV800, a member of the Virtex FPGA family
from Xilinx(1). Table 2 shows the specifications of some Virtex family members. A CLB is
a Configurable Logic Block, which can be configured to represent any 4-input switching
function to define a design. CLBs are also connected to each other by programming as part
of the design process. A design that is too large to fit into a single CLB can be
partitioned across multiple CLBs for full implementation [5].

(1) Xilinx is a registered trademark of Xilinx Corporation.


Device    System Gates  CLB Array  Logic Cells  Maximum        Block RAM  Maximum
                                                Available I/O  Bits       SelectRAM+™ Bits
--------  ------------  ---------  -----------  -------------  ---------  ----------------
XCV50           57,906  16x24            1,728            180     32,768            24,576
XCV100         108,904  20x30            2,700            180     40,960            38,400
XCV150         164,674  24x36            3,888            260     49,152            55,296
XCV200         236,666  28x42            5,292            284     57,344            75,264
XCV300         322,970  32x48            6,912            316     65,536            98,304
XCV400         468,252  40x60           10,800            404     81,920           153,600
XCV600         661,111  48x72           15,552            512     98,304           221,184
XCV800         888,439  56x84           21,168            512    114,688           301,056
XCV1000      1,124,022  64x96           27,648            512    131,072           393,216


Table 2. Virtex FPGA family members (From Ref. [6].)
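
The CLB Array, Logic Cells and SelectRAM+ columns of Table 2 are mutually consistent,
which offers a quick way to check the transcribed figures. The sketch below records only
an observation drawn from the table values themselves: in every row the logic-cell count
equals 4.5 times the number of CLBs, and the SelectRAM+ bit count equals 64 times the
number of CLBs. The dictionary and function names are illustrative and are not part of
any Xilinx tool.

```python
# Rows transcribed from Table 2: (CLB rows, CLB columns, logic cells, SelectRAM+ bits).
VIRTEX_FAMILY = {
    "XCV50": (16, 24, 1728, 24576),
    "XCV100": (20, 30, 2700, 38400),
    "XCV150": (24, 36, 3888, 55296),
    "XCV200": (28, 42, 5292, 75264),
    "XCV300": (32, 48, 6912, 98304),
    "XCV400": (40, 60, 10800, 153600),
    "XCV600": (48, 72, 15552, 221184),
    "XCV800": (56, 84, 21168, 301056),
    "XCV1000": (64, 96, 27648, 393216),
}


def derived_from_clb_array(rows, cols):
    """Return (logic cells, SelectRAM+ bits) implied by the ratios seen in Table 2."""
    clbs = rows * cols
    # Ratios observed in the table: 4.5 logic cells and 64 SelectRAM+ bits per CLB.
    return clbs * 9 // 2, clbs * 64


# Every row of the table satisfies both ratios.
for device, (rows, cols, cells, select_ram) in VIRTEX_FAMILY.items():
    assert derived_from_clb_array(rows, cols) == (cells, select_ram), device

print(derived_from_clb_array(56, 84))  # XCV800 -> (21168, 301056)
```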


One of the reasons for choosing this FPGA is that its package is a flat-pack. This type
of interface is spaceflight certified and has been used in space for years. Some of the
newest and largest FPGAs now use ball grid array (BGA) connections, which are not only
difficult to attach to a printed circuit board but also not qualified for space
applications [5].

C. SOFT-CORE PROCESSORS 

A soft-core processor is a set of source code written in a hardware description language
(HDL) that expresses the behavior of a real processor. It is a synthesizable HDL design
and has no explicit hardware realization. This type of processor provides great
flexibility but is limited in performance and predictability. A hard-core processor, on
the other hand, provides high performance but is not flexible.

Since a soft-core processor can be easily instantiated on an FPGA, a designer has a wide
range of selections and combinations. A soft core can be optimized for different FPGA
sizes and characteristics to improve performance, giving the most cost-efficient solution
for the target application. A hard core, which has specific function blocks, needs to
work with a special FPGA device. The demand for these specific FPGAs is limited;
therefore they do not have the benefits of large-scale manufacturing, which forces
vendors to support only a few FPGA packages. Another disadvantage of using a hard core is
that if a problem is found in one version, all specific FPGAs supporting that version
have to be revised. Hard cores are good for big and commonly used functions like a RAM
[4].

The soft-core processor chosen for this iteration of the CFTP is a 16-bit Reduced
Instruction Set Computer (RISC) KDLX processor. The DLX processor is coded in HDL and
described in Hennessy and Patterson’s Computer Architecture: A Quantitative Approach [7].
The KDLX processor is a revision of the DLX processor by Dr. Kenneth Clark that was used
on complex digital systems to predict SEU tolerance, as described in his dissertation
[8]. Therefore, one of the reasons to use this processor is that it had already been
designed and tested.

D. TRIPLE MODULAR REDUNDANCY (TMR) 

Once a system is launched into space, it is hard and expensive to maintain. In order to
correct errors caused by radiation, different approaches have been presented and actually
used in space. RADHARD devices and fault-tolerant designs are the most common. TMR is one
of the solutions that enables a circuit to tolerate the occurrence of an error and
correct it. This is done in software, so it is simple and low-cost. Taking advantage of
the FPGA, the TMR instantiated inside can be easily modified and upgraded in the future.

Basically, a TMR system is composed of three identical devices and voting logic, as shown
in Figure 2. The voting logic is a majority voter, which takes the majority of the inputs
as the output value. Since Devices B and C are replicas of Device A and they all accept
the same input value, the outputs of A, B and C should in theory be consistent. Due to
radiation effects in space, one of these three devices may have an internal error and
generate a different output. This inconsistency will be caught and corrected by the
voting logic. Thus, the voted output is always correct under the assumption of a single
error.
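
The voting rule described above can be sketched in a few lines. This is a behavioral
illustration only, not the voter circuit of the CFTP: it assumes 16-bit words, and the
function name and the error flag are hypothetical. Each output bit is the value on which
at least two of the three inputs agree.

```python
from typing import Tuple


def majority_vote(a: int, b: int, c: int, width: int = 16) -> Tuple[int, bool]:
    """Bitwise 2-of-3 majority vote over three words.

    Returns (voted_word, error_flag); error_flag is True when the inputs differ.
    """
    mask = (1 << width) - 1
    a, b, c = a & mask, b & mask, c & mask
    # A bit is 1 in the result when it is 1 in at least two of the inputs.
    voted = (a & b) | (a & c) | (b & c)
    error = not (a == b == c)
    return voted, error


good = 0x1234
upset = good ^ (1 << 5)  # one copy with a single flipped bit, as an SEU would cause
print(majority_vote(good, upset, good))  # (4660, True): 0x1234 recovered, error flagged
```

Under the single-error assumption the flipped bit is always outvoted, so the voted word
equals the two healthy copies.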
[Figure 2: block diagram. A common Input feeds Device A, Device B and Device C; Output A,
Output B and Output C feed the voting logic, which produces the Voted Output and the
Error Corrected Signal.]


Figure 2. Basic TMR Concept (After Ref. [1].)


When the TMR concept is applied to a microprocessor, it takes the form illustrated in
Figure 3. All output signals of the CPUs are voted; therefore no error should exist at
the outputs of the voters. Any error reported by a voter means that one of the CPUs has
an internal error. If that error is not corrected in some way, it may result in more
errors and finally become unrecoverable. Thus, the Error Encoder in Figure 3 is a device
that analyzes the error signals provided by the voters and determines which CPU generated
the error. Once the faulty CPU is identified, some extra circuits interrupt all three
processors and correct the error. When a simple circuit acting as a system is
instantiated on a chip (e.g., an FPGA), it is called a System-On-a-Chip (SOC). Recall
that a soft core is not efficient for complex functions; therefore the memory block in
Figure 3 is an external chip.
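
The role of the Error Encoder can be illustrated with a small behavioral sketch. It
assumes only what the text states, namely that a single CPU is faulty at a time; the
function name and the string labels are hypothetical and do not reflect the actual
encoder signals.

```python
from typing import Optional


def faulty_cpu(a: int, b: int, c: int) -> Optional[str]:
    """Name the CPU whose output disagrees with the other two, or None if all agree."""
    if a == b == c:
        return None
    if b == c:
        return "A"
    if a == c:
        return "B"
    if a == b:
        return "C"
    # All three outputs differ: this falls outside the single-error assumption.
    raise ValueError("more than one CPU disagrees")


print(faulty_cpu(0x00F0, 0x00F1, 0x00F0))  # B
```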


[Figure 3: block diagram; recoverable labels: Common Inputs, To output Interface.]


Figure 3. Microprocessor TMR Concept


The CFTP implements these basic ideas. The circuits that perform the interruption and
correct an error are quite complicated. All concepts for constructing a complete TMR
design are explained in the remaining chapters.
E. ORGANIZATION

Chapter II reviews previous theses and gives other information related to the CFTP.
Chapter III describes the testing environment and introduces the software used in the
thesis. Chapter IV discusses the functions and features of the KDLX and shows simulations
of all KDLX instructions. Chapter V goes over the design of the voter logic in previous
theses, then constructs the TMR Assembly and simulates it. Chapter VI describes the
Reconciler used to coordinate the different architectures in this design. Chapter VII
describes the Interrupt module designed for correcting errors in the registers. Chapter
VIII shows the simulation of the full design without any circuitry to handle the
reporting of errors. This chapter explains the function of the ISR and how the different
components work together. Chapter IX introduces the component used to store the data
needed for future analysis when an error occurs. This component is the Error Syndrome
Storage Device, and its function within the full design is verified in this chapter.
Chapter X contains conclusions and topics for follow-on research.

F. ADDITIONAL DOCUMENTATION 

Appendix A contains all schematics, test benches, and simulation results discussed in
this thesis. Some of the figures are zoomed in to provide better views of the small
numbers on the buses. Appendix B describes the whole instruction set of the KDLX.
Appendix C contains the VHDL code for all components designed in this thesis. The VHDL
files for the KDLX processor are also included.

G. CHAPTER SUMMARY 

This chapter has given a fundamental understanding of radiation effects, FPGAs and
soft-core processors. The general concept of a TMR design has been introduced as well.
Previous thesis work on the CFTP is reviewed in the next chapter, and the TMR technique
for correcting an error is also described. Reading earlier thesis work is always a good
starting point for learning: experience is shared and directions for follow-on research
are pointed out.
II. TMR REVIEW IN PREVIOUS WORK


Constructing a CFTP design is complex work and needs a significant amount of time to
finish. In order to have a flawless design, many conditions need to be considered and all
problems should be solved in a reasonable way. Selecting components may take a few days
or months depending on how much data or information is collected. Decisions may still be
changed at the last minute due to unpredictable situations or inevitable factors. A
change to one component in the final design will sometimes cause a series of
modifications to others. Building a fully functional CFTP clearly takes much effort, and
designers have to really understand how the circuits relate to each other in order to
revise or debug it. Unfortunately, graduate students at the Naval Postgraduate School
only stay for a short time. A big design like the CFTP is divided into several segments
and assigned to different students. Under these time constraints, students not only need
to understand what previous students have done but also have to take up a design in
progress. Most of the time, students picking up the segments do not have a chance to
learn directly from students who have worked on the design before. Thus, the thesis
becomes an important interface for passing experience between generations of students.

A. LASHOMB’S DESIGN 

Peter A. LaShomb [1] expressed many concepts in both TMR design and FPGA selection.
Traditional solutions for radiation effects were introduced, including hardware
redundancy, like Quadded Logic, and software improvements for fault tolerance, like time
redundancy or software redundancy. In the TMR section, RADHARD and COTS were compared in
availability, performance and cost. The potential benefits of the two were clearly
described as well. The processor used in his TMR design was the KCPSM, an 8-bit
microcontroller. It was freely downloaded from Xilinx’s website and served as a readily
available test-case processor while awaiting the availability of other high-performance
processors. Construction and testing of the TMR were done with the Xilinx Foundation
series software, which was available at the Naval Postgraduate School (NPS). Voters and
an error encoder were designed and explained in detail. Other issues, including the
interrupt routine and the memory/error controller, were left as follow-on research.
In the FPGA section, different FPGAs were compared in a number of aspects. The five major
parameters for choosing a good FPGA were gate count, availability of hardware and
software, package (flat-pack vs. ball-grid-array), re-programmability and radiation
tolerance. The Xilinx XCV800 was chosen as the candidate at that time for future
implementation.

B. EBERT’S RESEARCH 

A complete CFTP conceptual design was presented in Dean A. Ebert’s thesis [9]. For
hardware considerations, his thesis discussed why specific components were chosen and how
the chips communicated in an integrated circuit. FPGA and CFTP configurations were
described in more detailed and realistic terms than before, and chips were selected based
on a number of space-environment considerations. System memory was an important topic
first described in this thesis. The memory configuration controller, functional logic and
glue logic were also new ideas not discussed in previous work. The TMR circuitry was not
one of the main topics in his research, but from his work one can visualize the external
connections of the FPGA and understand the role of TMR in the CFTP process. Figure 4
illustrates the layout of the board he developed.



Figure 4. CFTP Conceptual Diagram (From Ref. [9].) 
The CFTP will be launched into LEO on two satellites, NPSAT-1 and MidSTAR-1, in 2006. His
thesis described how the Department of Defense and Navy Space Experiment Review Board
(SERB) and the Space Test Program (STP) Office were involved with these two satellites.
Other documents related to design descriptions and requirements of the STP office were
attached as appendixes as well.

C. JOHNSON’S IMPLEMENTATION 

Steven A. Johnson [5] focused his work on the TMR design. The essential components needed
to make a circuit fault-tolerant were identified. The circuits designed in LaShomb’s
thesis could not be used because of the different design architecture and the significant
upgrade of the computer-aided-design software employed. The basic concepts for
constructing a TMR circuit were still the same, but were implemented in a different way.

The KDLX, a 16-bit processor and an improvement over the 8-bit KCPSM, was the processor
used in Johnson’s research. His design consisted of tmra, Interrup, the Error Syndrome
Storage Device (ESSD) and the Reconciler. The block named tmra consists of three KDLX
processors and six voters. All processor output signals have to be voted. Interrup was
built from a state diagram and used to trigger the interrupt service routine that
corrects an error inside the KDLX. The ESSD was used to save the error syndrome in order
to provide a log for analysis. The KDLX is a Harvard architecture device, which has two
address buses and two data buses: one address and data bus pair for instruction memory
and another pair for data memory. The off-chip memory for the CFTP has a Von Neumann
architecture, with only one address bus and one data bus. Because of this difference, a
Reconciler was designed to coordinate the different timing constraints so that memory is
read and written properly. The difference between the Harvard and Von Neumann
architectures is explained again when the KDLX is introduced in Chapter IV.
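
A minimal behavioral sketch of the reconciling idea follows. It assumes, as stated in the
executive summary, that the memory runs at twice the processor clock, serving the
instruction fetch in the first half of each processor cycle and the data access in the
second half. The class and function names are hypothetical and do not correspond to the
VHDL in Appendix C.

```python
class SharedMemory:
    """Single-port (Von Neumann style) memory: one address bus and one data bus."""

    def __init__(self, size):
        self.cells = [0] * size

    def read(self, addr):
        return self.cells[addr]

    def write(self, addr, value):
        self.cells[addr] = value


def processor_cycle(mem, instr_addr, data_addr, write_data=None):
    """Model one processor clock as two back-to-back memory accesses."""
    # First half-cycle: the memory acts as instruction memory.
    instruction = mem.read(instr_addr)
    # Second half-cycle: the same memory acts as data memory.
    if write_data is not None:
        mem.write(data_addr, write_data)
        data = None
    else:
        data = mem.read(data_addr)
    return instruction, data


mem = SharedMemory(16)
mem.write(0, 0xA001)  # stand-in instruction word
mem.write(8, 42)      # stand-in data word
print(processor_cycle(mem, instr_addr=0, data_addr=8))  # (40961, 42)
```

From the processor's side both accesses complete within one of its own clock cycles,
which is why it can behave as if separate instruction and data memories were attached.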

Johnson’s full design schematic is shown in Figure 5. The memory is external to the FPGA
and should be connected to the Reconciler, located at the top left corner. Normally, tmra
communicates with the Reconciler in order to access memory. Meanwhile, the syndrome data
is latched into the ESSD regardless of whether an error occurs. When an error occurs, a
signal is sent to Interrup, which starts the Interrupt Service Routine (ISR). At this
moment, the KDLX is stalled and the ESSD saves the error syndrome to memory through the
Reconciler. Then Interrup issues a TRAP instruction to the KDLX and leads the whole
circuit into an error-correction condition. When the KDLX sees the TRAP instruction, it
jumps to a specific memory location, and the program counter value before the jump is
saved in the interrupt address register (IAR), a special register inside the KDLX. In the
error-correction condition, the contents of all registers inside the KDLX are saved to
memory through the voters. Then, each register is reloaded from memory. The purpose of
this step is to correct any inconsistencies among the registers of the three KDLX
processors. Since all contents have to pass through the voters while being saved, any
error inside any register is corrected.
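
The save-and-reload step can be sketched as follows. This is a behavioral illustration
under stated assumptions, not the ISR itself: register files are modeled as Python lists,
the voter is a bitwise 2-of-3 majority, and the names `vote` and `scrub_registers` are
hypothetical.

```python
def vote(a, b, c):
    """Bitwise 2-of-3 majority of three register values."""
    return (a & b) | (a & c) | (b & c)


def scrub_registers(reg_files):
    """Save every register through the voter, then reload all three processors.

    reg_files holds three equal-length lists, one per processor. The lists are
    updated in place and the voted memory image is returned.
    """
    a, b, c = reg_files
    # Save: each register passes through the voter on its way to memory.
    memory = [vote(x, y, z) for x, y, z in zip(a, b, c)]
    # Reload: every processor receives the same voted values.
    for regs in reg_files:
        regs[:] = memory
    return memory


cpus = [[1, 2, 3], [1, 2, 3], [1, 6, 3]]  # third copy has an upset in register 1
print(scrub_registers(cpus))              # [1, 2, 3]
print(cpus[2])                            # [1, 2, 3]
```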

The last instruction in the ISR is Return From Exception (RFE). This instruction
indicates the end of the ISR, and the program counter saved in the IAR is loaded back
into the KDLX. The logic gate set at the bottom of Figure 5 is a simple encoder of the
RFE instruction that tells Interrup to stop the ISR. Finally, the whole circuit goes back
to its normal operation.
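
The sequence described above (an error triggers a TRAP, and RFE ends the ISR) can be
summarized as a two-state machine. The sketch below is an assumption-laden illustration:
instructions are represented by mnemonic strings rather than KDLX encodings, and the
class name and the choice to ignore voter errors during the ISR are hypothetical.

```python
class InterrupSketch:
    """Two-state model: normal operation or inside the interrupt service routine."""

    def __init__(self):
        self.in_isr = False

    def step(self, fetched, voter_error):
        """Return the instruction driven to the processors for this cycle."""
        if not self.in_isr and voter_error:
            # An error was reported: substitute TRAP for the fetched instruction.
            self.in_isr = True
            return "TRAP"
        if self.in_isr and fetched == "RFE":
            # RFE marks the end of the ISR; normal operation resumes afterwards.
            self.in_isr = False
        return fetched


unit = InterrupSketch()
trace = [unit.step(instr, err) for instr, err in
         [("ADD", False), ("SUB", True), ("SW", False), ("RFE", False), ("ADD", False)]]
print(trace)  # ['ADD', 'TRAP', 'SW', 'RFE', 'ADD']
```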

This circuit gave a first illustration of the complexity of the design and was built
based on theory. Simulations and timing problems were left as follow-on research. It was
shown in software that even with such a large circuit built inside, the XCV800 FPGA still
had plenty of space and I/O blocks available.

D. CHAPTER SUMMARY 

This chapter introduced the work done by previous graduate students and pointed to where
other resources can be found. This thesis mainly focuses on the TMR design and follows
the concepts in LaShomb’s and Johnson’s research. The preliminary design has been done
and the general concepts have been given. The Interrup takes over the whole circuit when
an error occurs. Specific locations in memory are reserved for the ISR and for storing
error syndromes. No other instructions should be able to access these locations.

In the next chapter, the testing environment and the ISE software are introduced.
Developing a consistent testing environment is important in order to make valid
comparisons. A description of the software tools is also useful information for a reader,
as it helps in understanding the simulations.
Figure 5. Full TMR Design Schematic (From Ref. [5].)
III. TESTING ENVIRONMENT AND ISE SOFTWARE


It is hard to build a circuit without simulating it, since simulation is the cheapest and
fastest way to verify whether a design works. The software used for simulation and the
software used for constructing circuits do not need to be made by the same company.
Different programs may use different ways to compile code or run simulations. A circuit
built with specific functions offered in one program may not fit into other programs.
Therefore, a designer using programs made by different persons or companies sometimes
faces the problem of incompatibility. This issue can be solved if a complete package is
bought. Generally speaking, products made by the same company are more compatible with
each other, and it is easier for that company to provide complete customer service.

Simulation is a very important part of design. A good design without proper simulation
may have degraded performance or efficiency. Inaccurate simulation results can sometimes
mislead a designer into modifying something that should not be modified. A good
simulation result not only verifies a design but also helps others understand the concept
embodied in it. In terms of thesis research, simulation helps the designer and others
verify the design without spending too much time. Follow-on students can simply rerun the
program and confirm that the results are consistent.

All test-bench settings for the simulations are provided in this thesis. This kind of
information is often not available for published tests or simulations. Providing a
simulation result without its parameters means that others may not be able to understand
the testing background or build an identical test bench. This may not matter to a casual
reader, but it is important for a graduate student working on a thesis. First, a program
sometimes crashes and files are lost, which means someone may never get the same
simulation outputs. Second, a modified circuit sometimes needs a new test bench. Without
those parameters, simulations will be done under different testing environments and
performance improvements cannot be demonstrated.
A. COMPUTER SPECIFICATIONS

System performance is often an important factor in testing. Running a program on a slow
machine takes longer than on a fast machine, but the result should be the same. Where
timing is concerned, however, system performance can play an important role. A slow
computer may be unable to handle a large amount of data and sometimes forces the user to
reboot. As the TMR design becomes more complicated, simulation will take longer. The
rate at which a system can process data may also affect the accuracy of simulation.
Computer magazines routinely state the specifications of the testing environment,
especially when testing the performance of new hardware.

The TMR design so far is not so complicated that it needs a high-performance computer to
simulate it. The information in Table 3 can be used as a reference in future thesis work.


Model               IBM ThinkPad A31 (2652Q5U)
Processor           Pentium®² 4, 2.0 GHz
Memory              1 GB PC2100 DDR SDRAM
Hard Drive          40 GB 4200 RPM
Operating System    Windows 2000 Professional
OS version          5.0 Service Pack 3
Video Card          Mobility Radeon 7500 AGP


Table 3. Computer Specifications for Simulation


B. XILINX ISE SOFTWARE

The software used for constructing the TMR design is a package called ISE made by
Xilinx®³, one of the largest FPGA manufacturers in the world. This software is available
at NPS and is used in labs for some courses. Students who want to do FPGA design should
have a basic understanding of this program. For this research, it was necessary to learn
about ISE and its associated simulator from the Xilinx website [10], an in-depth tutorial
[11], and personal experience.


2 Pentium is a registered trademark of Intel Corporation. 

3 Xilinx is a registered trademark of Xilinx Corporation. 
ISE 5.2.03i was the version used for this thesis. Project Navigator is the overall
controller of the ISE design system. The other important program used in this thesis is
ModelSim®⁴, a powerful simulation tool; the full version name is ModelSim XE II 5.6e. The
logos of Project Navigator and ModelSim are shown in Figures 6 and 7.


[Project Navigator splash screen: Xilinx Design Solutions, Release Version 5.2.03i,
Application Version build+F-31+0, Copyright (c) 1995-2002 Xilinx, Inc.]



Figure 6. Xilinx ISE Project Navigator Logo 



Figure 7. Xilinx ISE ModelSim Logo 


The FPGA selected for the CFTP was a Xilinx Virtex XCV800 HQ240 with a speed grade of -4.
This is an FPGA with roughly 800,000 gate equivalents in a package with 240 pins. Because
the target device is a Xilinx part, using ISE to develop and simulate the TMR design
should give a better-matched design and a more realistic simulation than other programs.

While this research was being performed, Xilinx released a new version, ISE 6.1i, to its
customers. Xilinx has warned that loading a project made in an older version of ISE into
ISE 6.1i makes an unrecoverable change, after which the project can no longer be read by
older ISE software. Because many simulations had already been completed, and in order to
keep the testing environment consistent, simulation on the latest version is left as
future work.


4 ModelSim is a registered trademark of Mentor Graphics Corporation.

C. CHAPTER SUMMARY 

This chapter summarized the hardware and software used, along with the simulation
environment. Simulation output may look different in different software versions, and
new errors are sometimes reported. A newer version may point out undiscovered errors or
potential defects in a design. The differences between new and old versions are
sometimes described in the user guide or on the company’s website. It is good to know the
main changes in new software and to expect their effect on an old design. Work becomes
more efficient when one can exploit a program’s features and functions.

The components of the TMR design are introduced in the following chapters. Each circuit
is built and tested before the full design is constructed, so simulation results are
used to explain how each circuit functions.
IV. KDLX INTRODUCTION


The KDLX, a 16-bit processor, is the kernel of this TMR design. It is the soft-core
processor used for each of the three processors in the TMR system shown in Figure 3. As
the final test procedure, each component in the design is connected to a KDLX processor
and tested. Because of the KDLX pipeline and wiring delays, a circuit that works by
itself in a test bench sometimes does not work with a KDLX. Knowing the KDLX helps a
designer foresee problems when building a circuit around it. Therefore, understanding the
KDLX is the first step in constructing a TMR design.

A. INSIDE KDLX

The KDLX is coded in VHDL, the VHSIC (Very High Speed Integrated Circuit) Hardware
Description Language. It is composed of two top-level blocks, core and IO_Pads, as shown
in Figure 8. The names core and IO_Pads are block names; core1 and IO_Pads_1 are the
local block names that represent core and IO_Pads, respectively, in the VHDL file called
“dlx.vhd”. The word KDLX at the top right corner is the name of the outer block. Numbers
next to input and output pins give the width of the bus.

Words in bright green are local signals, and none of the interconnections between these
local pins is accessible from the outside (e.g., the connection between InData on
IO_Pads_1 and Input data on core1). All pins on the left side are input signals and all
pins on the right side are output signals, except the Data bus. Controlled by IO_Pads_1,
the data bus on the KDLX is bidirectional. It sends out data when writing to memory and
stays at high impedance otherwise. High impedance allows other devices connected to the
data bus to drive it, but the KDLX does not accept inbound data at that moment. The
dashed sky-blue line inside IO_Pads_1 is an internal connection. This internal connection
functions only when the input signal Out_En_n is low.

Notice that most input and output pins of the KDLX are the same as those of core1. The
function of IO_Pads_1 is to interface the external bidirectional data bus to the input
data and output data buses on core1. To understand the KDLX better, the core needs to be
explored.
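
The behavior of the bidirectional data bus described above can be sketched in Python. This is only a behavioral illustration under stated assumptions, not the VHDL in dlx.vhd: the names io_pad_drive and resolve_bus are hypothetical, the string "Z" stands for high impedance, and it is assumed that Out_En_n low means the KDLX is driving the bus.

```python
HIGH_Z = "Z"  # stands for a high-impedance (undriven) bus


def io_pad_drive(out_data, out_en_n):
    """Value the pad places on the external bus (assumes Out_En_n is active low)."""
    return out_data if out_en_n == 0 else HIGH_Z


def resolve_bus(*drivers):
    """Resolve a shared bus: at most one driver may be active at a time."""
    active = [d for d in drivers if d != HIGH_Z]
    if len(active) > 1:
        raise ValueError("bus contention: more than one active driver")
    return active[0] if active else HIGH_Z


# KDLX writing: the pad drives the bus and the memory output is released.
print(resolve_bus(io_pad_drive(0x0015, out_en_n=0), HIGH_Z))
# KDLX not writing: the pad releases the bus so that the memory can drive it.
print(resolve_bus(io_pad_drive(0x0015, out_en_n=1), 0x0003))
```

The contention check reflects the reason for the high-impedance state: only one device may drive the shared bus at any moment.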
Major functional blocks are all inside core and are shown in Figure 9. These blocks are
zero_test, pipeline, regfile, pc_control, rw_control, alu, word_regsingle, word_mux3 and
word_mux4.

Figure 9. Inside core

The local block name used in the file “core.vhd” is boxed at the top of each function
block. Words in bright green are again local signals, and those in sky blue represent
signals that are global only within the core. They are considered global because most
blocks have these signals and all blocks receive the same value. For instance, all blocks
receive zero when the signal resetn is low. When the global signal Shift_En is low, the
local block pipeline_1 may invert it to high internally and use it to trigger other
functions. Therefore, Shift_En being low in the core does not mean that the signal is low
inside pipeline_1. That is why these signals are treated as global within the core only.

The detailed functioning of each block is described in the KDLX VHDL code. Figures 8 and
9 were drawn directly from the original VHDL code to illustrate how these components
connect. The functions of important components such as alu, regfile, pc_control,
rw_control and pipeline are summarized here. Simulation of the KDLX later in this chapter
will verify these functions.

1. Function of alu 

This block performs addition, logic computation, and barrel shifting. Subtraction is
achieved by adding a positive number to a negative number. The KDLX uses 2’s complement
arithmetic for calculation. A simple table of 8-bit 2’s complement numbers is shown in
Table 4.


Binary number       Equivalent Decimal number
---------------     -------------------------
0 1 1 1 1 1 1 1      127
0 0 0 0 0 0 1 1        3
0 0 0 0 0 0 1 0        2
0 0 0 0 0 0 0 1        1
0 0 0 0 0 0 0 0        0
1 1 1 1 1 1 1 1       -1
1 1 1 1 1 1 1 0       -2
1 1 1 1 1 1 0 1       -3
1 1 1 1 1 1 0 0       -4
1 0 0 0 0 0 0 0     -128


Table 4. 2’s Complement Numbers
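
The entries of Table 4 can be reproduced with a short Python sketch. The helper names to_signed and to_pattern are hypothetical and are used only for illustration; the 8-bit width follows the table, while the KDLX itself works on 16-bit values.

```python
def to_signed(value, bits=8):
    """Interpret an unsigned bit pattern as a 2's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def to_pattern(number, bits=8):
    """Return the 2's complement bit pattern of a signed number as a string."""
    return format(number & ((1 << bits) - 1), "0{}b".format(bits))


for n in (127, 3, 2, 1, 0, -1, -2, -3, -4, -128):
    print(to_pattern(n), n)

# Subtraction as addition of a negative number: 5 - 3 computed as 5 + (-3).
print(to_signed((5 + (-3 & 0xFF)) & 0xFF))
```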
Logic computation includes the logic AND, OR and XOR functions. The KDLX allows a user to
perform logic computation between the contents of two registers or between the contents
of a register and an immediate value.

A built-in barrel shifter gives the KDLX the ability to do logical or arithmetic shifting.

2. Function of regfile 

All 15 registers of the KDLX are in this block. The inbound data bus is connected to all
registers, and an enable bus controls which register is written. Two large multiplexers,
MUXA and MUXB, route the output of a selected register to the outbound data bus.
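
A behavioral sketch of such a register file is given below. It is an illustration under stated assumptions rather than the regfile VHDL: the class name RegFile is hypothetical, the enable bus is modeled as a one-hot integer, and the file is assumed to hold entries R0 to R15 with R0 fixed at zero, which leaves 15 writable registers.

```python
class RegFile:
    """Behavioral sketch of a register file with one write port and two read muxes."""

    def __init__(self, count=16, width=16):
        self.mask = (1 << width) - 1
        self.regs = [0] * count              # all registers start at 0000

    def write(self, enable_bus, data_in):
        """enable_bus is a one-hot integer; bit i set means register i is written."""
        for i in range(1, len(self.regs)):   # register 0 is skipped so it stays zero
            if (enable_bus >> i) & 1:
                self.regs[i] = data_in & self.mask

    def read(self, sel_a, sel_b):
        """MUXA and MUXB each select one register for the outbound side."""
        return self.regs[sel_a], self.regs[sel_b]


rf = RegFile()
rf.write(1 << 3, 0x0014)       # write register 3
print(rf.read(3, 0))           # MUXA selects R3, MUXB selects R0
```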

3. Function of pc_control

The program counter sends an address to the instruction memory in order to fetch the next
instruction. The pc_control block plays an important role when a Branch, Jump or TRAP
instruction is executed. For some instructions, such as Jump and Link, pc_control saves
as the return address the address of the instruction that comes after the next two
instructions. This is because the KDLX is pipelined: the two instructions after the Jump
are executed before the jump occurs. The return address is saved in register 15.
Since no KDLX instruction is able to read the return address in register 15 directly,
another circuit needs to be constructed in order to jump back to where the Jump and Link
instruction left off.

Another important component in pc_control is the interrupt address register (IAR), which
was mentioned in Johnson’s implementation. The IAR is not accessible to the user. This
special register is used only to save the return address of the TRAP instruction. When
the TRAP instruction is executed, the return address (the address right after the next
two instructions) is saved into the IAR. The program counter then jumps to another memory
location and starts reading another set of instructions. An instruction named Return From
Exception (RFE) is placed at the end of that set of instructions. RFE reads the IAR and
jumps back to the memory location indicated. The jump, branch and trap implementations
are discussed again when the KDLX is simulated later in this chapter.
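
The return-address rule for Jump and Link and for TRAP can be sketched as follows. This is a simplified model, not the pc_control VHDL: the class and function names are hypothetical, and the program counter is assumed to count instruction words, increasing by one per instruction.

```python
DELAY_SLOTS = 2  # instructions after a jump or trap that still execute


def return_address(pc):
    """Address right after the next two instructions, wrapped to 16 bits."""
    return (pc + DELAY_SLOTS + 1) & 0xFFFF


class PcControlSketch:
    def __init__(self):
        self.r15 = 0   # register 15, written by Jump and Link
        self.iar = 0   # interrupt address register, written by TRAP

    def jump_and_link(self, pc, target):
        self.r15 = return_address(pc)
        return target

    def trap(self, pc, handler):
        self.iar = return_address(pc)
        return handler

    def rfe(self):
        """Return From Exception: resume at the address held in the IAR."""
        return self.iar


ctl = PcControlSketch()
next_pc = ctl.trap(0x10, 0x80)     # TRAP at 0x10, handler assumed at 0x80
print(hex(next_pc), hex(ctl.rfe()))
```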
4. Function of rw_control

This is where the KDLX controls the read, write and program read signals for the memory
modules attached to it. An important point is that the KDLX read and write signals are
active low: a read or write is requested when the corresponding signal is driven low. In
this design, these two signals are asserted at the falling edge of the clock.

5. Function of pipeline

Inheriting the nature of the DLX, the KDLX is a five-stage pipelined processor; the stages
are Fetch, Decode, Execute, Memory and Write Back. At the Decode stage, the signals used
to select registers in regfile are assigned. At the Execute stage, eight instructions are
specifically monitored: Jump, Jump and Link, Branch if Equal Zero, Branch if Not Equal
Zero, RFE, TRAP, Jump Register, and Jump Register and Link. At the Memory stage, the
signals that allow the KDLX to read from or write to memory are generated. The last
stage, Write Back, allows most instructions, with some specific exceptions, to write to
registers.

6. KDLX Summary 

The ISE software can convert VHDL code to a schematic, so the user has the option of
studying a circuit without understanding the VHDL code. The schematic is more graphical
than code and lets people see how a circuit is wired. The schematic symbol of the KDLX is
shown in Figure 10.


dlx 



Figure 10. Schematic Symbol of KDLX 
a. Inputs and Outputs

As mentioned earlier, the KDLX has four inputs, five outputs and one bidirectional bus.
The four inputs are three 1-bit pins (clock_in, resetn and stalln) and one 24-bit
instruction bus. The five outputs are three 1-bit pins (prog_rd, rd and wr) and two
16-bit buses, addr_int(15:0) and pc(15:0). The only bidirectional bus is the 16-bit data
bus. The functions of these pins are listed in Table 5.


Symbol            Signal Name        Function
--------------    ---------------    ---------------------------------------------------
clock_in          Clock input
resetn            Reset              Reset KDLX when low. All register contents are
                                     cleared.
stalln            Stall              Stall KDLX when low. Stall everything including
                                     data in pipeline stage.
instr(23:0)       Instruction Bus    Receive instructions sent from instruction memory.
prog_rd           Program Read
rd                Read               Read data from data memory when low.
wr                Write              Write data to data memory when low.
addr_int(15:0)    Data Address       Send data address to data memory.
pc(15:0)          Program Counter    Send instruction address to instruction memory.
data(15:0)        Data Bus           Receive data from data memory or send data out to
                                     data memory.

Table 5. Function of Pins on KDLX


b. Harvard Architecture and Von Neumann Architecture

The KDLX is a Harvard architecture device: it has one pair of address and data buses for
instruction memory and another pair for data memory. Figure 11 illustrates the concept of
this architecture. The device at the center sends the address of an instruction to the
instruction memory on the left, which sends the instruction back to the device. If the
instruction received is to read or write data memory, the device sends a data address to
the data memory on the right to indicate the memory location it wants to read or write.
If the device wants to read, the data bus is driven by the data memory and data is sent
from the data memory to the device. If the device wants to write, the data bus is driven
by the device and data is sent from the device to the data memory.
[Diagram labels: Instruction Memory, Data Memory; bus labels: instruction, data]


Figure 11. Harvard Architecture


Applying the same concept to the KDLX gives the arrangement shown in Figure 12.


[Diagram labels: Instruction Memory, KDLX, Data Memory; buses: pc(15:0) and instr(23:0)
on the instruction side, addr_int(15:0) and data(15:0) on the data side]


Figure 12. KDLX Connections with Two Memories


The Von Neumann architecture, on the other hand, has only one address bus and one data
bus, and uses a single memory. A processor using the Von Neumann architecture has fewer
timing issues to resolve with memory because the two share the same architecture. A
Harvard-architecture processor such as the KDLX must deal with possible timing mismatches
if only one memory is available. In the CFTP design, only one memory is available to the
TMR circuit, so it serves as both the instruction memory and the data memory. Recall that
the component in Johnson’s implementation called the Reconciler is the device used to
integrate these two different architectures.

In order to consolidate a four-bus processor with a two-bus memory, the memory has to run
at double speed to support two accesses per processor clock cycle. Figure 13 shows how
the KDLX communicates with only one memory.



Figure 13. KDLX with One Memory 
Since the KDLX is a pipelined processor, it needs to be able to read or write data at
the same time as it fetches an instruction. Both events can happen in one KDLX clock
cycle. If the memory is twice as fast as the KDLX, it can handle the instruction in the
first memory clock cycle and the data in the second memory clock cycle. In Figure 13,
pc(15:0) and instr(23:0) are handled in the first memory clock cycle; addr_int(15:0) and
data(15:0) are handled in the second. The memory used here needs to be 24 bits wide
because of the width of the instruction bus. Because the KDLX data bus is only 16 bits
wide, only the lower 16 bits of data are accepted and the upper 8 bits are dropped by
buffers.
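
The two-accesses-per-cycle scheme can be illustrated with a small Python model. It is a sketch under stated assumptions, not the Reconciler design: the names SharedMemory and kdlx_cycle are hypothetical, and the two memory clock cycles are modeled simply as two sequential accesses.

```python
class SharedMemory:
    """One 24-bit-wide memory serving both instruction fetches and data accesses."""

    def __init__(self, size=256):
        self.cells = [0] * size

    def access(self, addr, write=False, data=0):
        if write:
            self.cells[addr] = data & 0xFFFFFF
        return self.cells[addr]


def kdlx_cycle(mem, pc, addr_int=None, wr_data=None):
    """One processor clock = two memory clocks (instruction first, then data)."""
    instr = mem.access(pc)                        # memory clock 1: 24-bit instruction
    data = None
    if addr_int is not None:                      # memory clock 2: optional data access
        if wr_data is not None:
            mem.access(addr_int, write=True, data=wr_data & 0xFFFF)
        else:
            data = mem.access(addr_int) & 0xFFFF  # keep only the lower 16 bits
    return instr, data


mem = SharedMemory()
mem.cells[1] = 0x440103          # an instruction word
mem.cells[3] = 0xAB0003          # upper 8 bits are ignored on a data read
print(kdlx_cycle(mem, pc=1, addr_int=3))
```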

B. PIPELINE CONCEPTS 

The KDLX is a five-stage pipelined processor. The five stages are Fetch, Decode, Execute,
Memory (Mem) and Write Back (WB). When doing a write, data is written to a register at
the third clock cycle, i.e., the Execute stage. Therefore, a destination register used in
one instruction is not available until two clock cycles later. This has a significant
impact when creating a test bench. Figure 14 shows the pipeline execution of the KDLX in
normal operation.


Instruction                              Clock cycle
number         1        2        3        4        5        6        7        8        9
-------------  -------  -------  -------  -------  -------  -------  -------  -------  ---
Instruction 1  Fetch    Decode   Execute  Mem      WB
Instruction 2           Fetch    Decode   Execute  Mem      WB
Instruction 3                    Fetch    Decode   Execute  Mem      WB
Instruction 4                             Fetch    Decode   Execute  Mem      WB
Instruction 5                                      Fetch    Decode   Execute  Mem      WB


Figure 14. Pipeline Execution in KDLX


In Figure 14, if Instruction 1 loads data from memory into register 3 (for example), the
action of loading register 3 starts at clock 3 and ends at clock 5. Register 3 therefore
should not be used as a source register in Instructions 2, 3 and 4; otherwise, those
instructions will fetch either the old value or undefined data. The new value of register
3 is available only to Instruction 5 or later.
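
The schedule in Figure 14 and the rule for using a loaded register can be expressed as a short Python sketch. The function names are hypothetical, and the default wait of three instructions is taken from the rule stated above for a load; other instruction types may need a different value.

```python
STAGES = ("Fetch", "Decode", "Execute", "Mem", "WB")


def stage_at(instr_no, clock):
    """Stage occupied by instruction instr_no (1-based) at a clock cycle, or None."""
    index = clock - instr_no            # instruction n is fetched in clock n
    return STAGES[index] if 0 <= index < len(STAGES) else None


def first_safe_consumer(producer_no, wait=3):
    """First instruction that may read a register loaded by producer_no."""
    return producer_no + wait + 1


def nops_needed(producer_no, consumer_no, wait=3):
    """NOPs to insert so that the consumer sees the new register value."""
    return max(0, first_safe_consumer(producer_no, wait) - consumer_no)


print(stage_at(1, 3), stage_at(5, 9))   # Execute WB
print(first_safe_consumer(1))           # 5
print(nops_needed(1, 2))                # 3
```

A test bench author can use the same rule to decide how many NOP instructions to place between a load and the first instruction that uses the loaded register.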
C. MEMORY IN SIMULATION

All components generated for the TMR design were simulated with the KDLX and memory as
the final step. The ISE software offers several kinds of RAM and ROM as schematic
components for users to choose from. A designer can also construct a memory in VHDL code.
In addition, the CORE Generator (Coregen) is a graphical, interactive design tool in the
ISE software that helps a user design a module. Because of its simplicity, the memory
used in this thesis was generated with Coregen.

A 24-bit memory and its simulation result are shown in Appendix A, section A. For
explanation, a copy of this simulation is reproduced as Figure 15.

Figure 15. 24-bit Memory Simulation Result

Values on the address bus and input data bus are assigned in the test bench. In this
simulation, the memory is written starting at point 1. The first value (000047₁₆) is
written into memory location 00₁₆, the second value (00004C₁₆) is written into memory
location 01₁₆, and so on. At point 2, the memory starts being read and all values are
output as originally written. One feature of this memory is that data sent to the data_in
bus for writing also appears on the data_out bus, so a designer can monitor the data
being written into memory. The write enable signal of this memory is active low;
therefore the memory reads when this signal is high.
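
The memory behavior just described can be modeled in a few lines of Python. This is a behavioral sketch only: the class name SimMemory and the port names used as arguments are hypothetical stand-ins for the Coregen module, and clocking is reduced to one method call per memory clock.

```python
class SimMemory:
    """Behavioral sketch: active-low write enable, written data echoed on data_out."""

    def __init__(self, depth=256, width=24):
        self.mask = (1 << width) - 1
        self.cells = [0] * depth

    def clock(self, addr, data_in, we_n):
        """One memory clock; returns the value seen on data_out."""
        if we_n == 0:                        # write enable is active low
            self.cells[addr] = data_in & self.mask
        return self.cells[addr]              # written data also appears on data_out


mem = SimMemory()
for addr, value in enumerate((0x000047, 0x00004C)):
    print(hex(mem.clock(addr, value, we_n=0)))   # write phase (point 1)
for addr in range(2):
    print(hex(mem.clock(addr, 0, we_n=1)))       # read phase (point 2)
```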

The memory used in simulation can be a RAM or a ROM. A ROM is used as an instruction
memory, which is not allowed to be written. A RAM can be initialized by writing to it
before it is used, but a ROM cannot, since it does not have a write enable pin. Thus, a
ROM needs to be pre-configured. In the ISE software, a user needs to create a .coe file
and load it when the memory is generated in Coregen.
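
A .coe file can be produced by a small script instead of by hand. The sketch below is an assumption-laden illustration: the function name make_coe is hypothetical, and the two keywords shown (memory_initialization_radix and memory_initialization_vector) are the ones commonly used in Xilinx COE files; they should be checked against the Coregen documentation for the ISE version in use.

```python
def make_coe(words, width_bits=24):
    """Build the text of a COE file holding the given words in hexadecimal."""
    digits = (width_bits + 3) // 4
    body = ",\n".join("{:0{}X}".format(w, digits) for w in words)
    return ("memory_initialization_radix=16;\n"
            "memory_initialization_vector=\n" + body + ";\n")


# First few instruction words of a pre-configured instruction ROM (illustrative).
print(make_coe([0x000000, 0x440103, 0x440204]))
```

Generating the file from a list of opcodes makes it easy to keep several memory images, one per instruction set, and to regenerate them when a test changes.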

The memory offered in the ISE software is not a true Von Neumann memory, since it has
separate buses for data input and output. For simplicity, the TMR design in this thesis
uses this kind of memory. Further modification will be needed when a true Von Neumann
architecture memory is available.

D. KDLX SIMULATION WITHOUT MEMORY 

Operation codes (Opcodes) for the instruction set are described in Appendix B, which
includes all instructions that can be implemented in the KDLX. Simulating all
instructions is one of the best ways to understand how the KDLX functions. Before doing
that, a simple simulation of the KDLX by itself is shown in Appendix A, section B. Figure
16 is a copy of this simulation result for explanation. All registers in the KDLX are
initialized to the value 0000, and register 0 is always zero.
Figure 16. KDLX Simulation


In Figure 16, the first instruction at point 1 loads the value at memory location
[(register 0)+05] into register 3. The read signal becomes low at point 2. Comparing the
timing here with Figure 14 shows that the action on the register occurs at the Execute
clock cycle. Since two values, 0014₁₆ and 0015₁₆, are already available on the bus, the
KDLX loads them into register 3 and register 5, respectively. Recalling the pipeline
behavior discussed with Figure 14, the new content of register 5 is not available at any
clock cycle before point 3. Using register 5 anywhere before point 3 will use the old
value in register 5, which is 0000₁₆ in this case. In this simulation, three NOPs are
inserted before register 5 is used.

At point 3, instruction 450507₁₆ stores the content of register 5 to the memory location
[(register 0)+07]. Again, the action starts at point 4, which is the Execute cycle for
this instruction, and the value loaded earlier shows up on the data bus.

Since the data bus is at high impedance during this clock cycle, the KDLX is able to
drive the bus and output data. If the bus were not at high impedance, the KDLX could not
use it, because another device would be driving it. By checking the address bus in the
KDLX simulation, one can see how the instruction and address correspond with each other.

The two instructions following the store instructions are 413408₁₆ and 415601₁₆. These
add immediate values to registers 3 and 5, respectively, so the data inside registers 3
and 5 changes. This can be seen at point 5, when these two register contents are stored
again.

For the rest of this thesis, assembly language mnemonics are used to refer to
instructions. A register is represented by R; thus, R0 stands for register 0 and R1 means
register 1. Instead of a long explanation of each instruction, the operation symbol is
also used in what follows. For example, the instruction 440305₁₆ is represented as
LW R3←Mem(R0+05). The symbols and expressions are defined in Appendix B.
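
The relationship between a 24-bit instruction word and its operation symbol can be illustrated with a small decoder. The field layout used here is an inference from the hexadecimal examples in this chapter (for instance 440305₁₆ for LW R3←Mem(R0+05)), not a definition taken from Appendix B, and the names split_fields and describe_load_store are hypothetical.

```python
def split_fields(word):
    """Split a 24-bit instruction word into fields (layout inferred from examples)."""
    return {
        "opcode": (word >> 16) & 0xFF,   # e.g. 0x44 for LW, 0x45 for SW
        "ra": (word >> 12) & 0xF,        # first source or base register
        "rb": (word >> 8) & 0xF,         # destination (or the register stored by SW)
        "imm": word & 0xFF,              # 8-bit immediate or offset
        "rc": (word >> 4) & 0xF,         # second source in register-register forms
    }


def describe_load_store(word):
    """Render LW and SW words in an ASCII form of the operation symbol."""
    f = split_fields(word)
    if f["opcode"] == 0x44:
        return "LW R{rb}<-Mem(R{ra}+{imm:02X})".format(**f)
    if f["opcode"] == 0x45:
        return "SW R{rb}->Mem(R{ra}+{imm:02X})".format(**f)
    return "opcode {opcode:02X} (not decoded in this sketch)".format(**f)


print(describe_load_store(0x440305))
print(describe_load_store(0x450507))
```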

E. KDLX SIMULATION WITH MEMORY 

There are a total of 42 instructions for the KDLX. Understanding these instructions is
necessary to generate a test bench for the TMR processor. Using different combinations of
instructions can also help a designer achieve the same simulation goal with a shorter
test bench. Instead of loading a large number of instructions into instruction memory
before testing, pre-configured memory is used. Simply by selecting a different memory
file, the same test bench can be used to test different instruction sets; otherwise,
several test benches would be needed.
Instead of testing all instructions in one huge test bench, the 42 instructions were
separated into four instruction sets. Instruction sets 1 and 2 test arithmetic and logic
functions. Instruction sets 3 and 4 test Jump, Branch and TRAP functions.

The schematic designed for this testing is shown in Figure 17. The memory on the left
side is a ROM used as instruction memory. The one on the right side is the data memory,
which is a RAM. The addr box contains only buffers used to truncate the width of the
address bus, since the memory address for this design is only 8 bits wide. Data memory is
pre-configured with 0003₁₆ because some numbers need to be loaded into registers at the
beginning of simulation.

Figure 17. KDLX with Instruction and Data Memory

The write signal on the KDLX is connected directly to the data memory so that the memory
can be written. Since the KDLX uses a bidirectional data bus, buffers with enable pins
are needed to control the direction of data flow. The read and write signals are used to
enable or disable these buffers. Extra output buses are added for monitoring purposes.
All test benches and simulation results are in Appendix A, section C.

1. Implementation Table of Instruction Set 1 

An implementation table is given as Table 6. Constructing such an instruction test bench
can take a lot of time, since instructions need to be rearranged and simulation results
need to be checked. Few instructions are tested in each set, but a number of load and
store instructions are needed to check the data. All numbers in Table 6 are hexadecimal,
and R0 is always zero.


Instruction (operation symbol)      Opcode    Value through Data Bus
--------------------------------    ------    ----------------------
LW      R1←Mem(R0+03)               440103
SW      R1→Mem(R0+08)               450108    0003
LW      R2←Mem(R0+04)               440204
SW      R2→Mem(R0+09)               450209    0003
ADD     R1+R2→R3                    011320
SW      R3→Mem(R0+0D)               45030D    0006
ADDI    R1+ext(F9)→R4               4114F9
SW      R4→Mem(R0+0E)               45040E    FFFC
ADDUI   R1+(0A)→R5                  21150A
SW      R5→Mem(R0+0F)               45050F    000D
AND     R1•R3→R6                    091630
SW      R6→Mem(R0+10)               450610    0002
ANDI    R4•(FD)→R7                  2947FD
SW      R7→Mem(R0+11)               450711    00FC
LHI     R8←FF(0)^8                  0808FF
SW      R8→Mem(R0+12)               450812    FF00
OR      R1+R3→R9                    0A1930
SW      R9→Mem(R0+13)               450913    0007
ORI     R1+(F0)→R10                 2A1AF0
SW      R10→Mem(R0+14)              450A14    00F3
SEQ     R1=R2→R11=1                 181B20
SW      R11→Mem(R0+15)              450B15    0001
SEQ     R1≠R3→R12=0                 181C30
SW      R12→Mem(R0+16)              450C16    0000
SEQI    R1=(0003)→R13=1             581D03
SW      R13→Mem(R0+17)              450D17    0001
SEQI    R1≠(0004)→R14=0             581E04
SW      R14→Mem(R0+18)              450E18    0000
SLL     R4<<R2=(0003)→R15           114F20
SW      R15→Mem(R0+19)              450F19    FFE0
SLLI    R4<<(0005)→R3               514305
SW      R3→Mem(R0+1A)               45031A    FF80
SRA     R4>>R1=(0003)→R5            134510
SW      R5→Mem(R0+1B)               45051B    FFFF
SRLI    R4>>(0003)→R6               524603
SW      R6→Mem(R0+1C)               45061C    1FFF
SUBI    R8-ext(7B)→R7               43877B
SW      R7→Mem(R0+1D)               45071D    FE85
XOR     R9⊕R10→R11                  0B9BA0
SW      R11→Mem(R0+1E)              450B1E    00F4


Table 6. Instruction Set 1



There are four sections in this table. In each section, the instructions that load or
compute data are implemented first, and the store instructions used to check the data are
implemented afterward. The third column lists the Opcodes to be implemented and the
fourth column shows the data that should appear on the data bus.
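
The expected data-bus values in Table 6 can be cross-checked with ordinary 16-bit integer arithmetic. The sketch below is an independent recomputation, not a model of the KDLX: the helper names are hypothetical, and the treatment of immediates (sign-extended for ADDI and SUBI, zero-extended for ADDUI, ANDI and ORI) is inferred from the expected values in the table.

```python
MASK = 0xFFFF


def ext(byte):
    """Sign-extend an 8-bit immediate to 16 bits."""
    return (byte | 0xFF00) if byte & 0x80 else byte


def sra(value, amount):
    """Arithmetic right shift of a 16-bit value (the sign bit is replicated)."""
    signed = value - 0x10000 if value & 0x8000 else value
    return (signed >> amount) & MASK


r1 = r2 = 0x0003                      # loaded from the pre-configured data memory
computed = {}
computed["ADD"] = (r1 + r2) & MASK
computed["ADDI"] = (r1 + ext(0xF9)) & MASK
computed["ADDUI"] = (r1 + 0x0A) & MASK
computed["AND"] = r1 & computed["ADD"]
computed["ANDI"] = computed["ADDI"] & 0x00FD
computed["LHI"] = (0xFF << 8) & MASK
computed["OR"] = r1 | computed["ADD"]
computed["ORI"] = r1 | 0x00F0
computed["SLL"] = (computed["ADDI"] << r2) & MASK
computed["SLLI"] = (computed["ADDI"] << 5) & MASK
computed["SRA"] = sra(computed["ADDI"], r1)
computed["SRLI"] = computed["ADDI"] >> 3
computed["SUBI"] = (computed["LHI"] - ext(0x7B)) & MASK
computed["XOR"] = computed["OR"] ^ computed["ORI"]

for name, value in computed.items():
    print("{:6s}{:04X}".format(name, value))
```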

2. Simulation Result of Instruction Set 1 

To see the difference with the simulation of KDLX only, part of the simulation 
results is shown in Figure 18. 



Figure 18. Simulation of KDLX with Memory 
In order to make sure that the memory is stable before the KDLX uses it, the memory
clock rate is doubled. The instruction memory is then ready before the KDLX reads the
instruction. The data memory writes data in a very short time and is always ready to be
read by the KDLX.

Comparing the timing before and after the KDLX is connected to the memory reveals a
delay in the read and write operations. In Figure 18, the instruction at point 1 does not
start the write until point 2. Without the memory, this signal would appear about
one-half clock cycle earlier than point 2. The difference is due to the timing delays
introduced by the connected memory. The fourth cycle of the KDLX pipeline is Mem, which
means that the KDLX is accessing memory at this time.

Another delay appears in instruction fetching. (Recall the schematic in Figure 17.) The
program counter of the KDLX sends an instruction address to the instruction memory. The
instruction memory then reads the program counter and sends an instruction to the KDLX.
This delay makes each instruction in Figure 18 start at the falling edge of the clock,
unlike the instructions in Figure 16, which start at the rising edge. The same delay
occurs when the KDLX reads from or writes to the data memory.

The pipeline behavior can also be seen in Figure 18. While the KDLX is still sending out
data, it is simultaneously fetching a new instruction.

An alternative way to check the simulation result is to construct tables for the
memories and registers, as shown in Table 7. The instruction memory is pre-configured as
shown in the first table. The second table shows how the contents of the registers change
during the simulation. The third table gives the values in the data memory locations
after the simulation is done. Blank areas in data memory contain the default value
0003₁₆.

In the instruction memory, a series of store instructions is used to check the contents
of the registers, and a series of load instructions is used to check the contents of the
memory locations. The first six Opcodes implement the instructions in section 1 of Table
6. The Opcodes in memory locations 08 to 10 then execute the instructions in section 2 of
Table 6. All instructions for loading and computation are executed before the results are
stored to memory. The instruction sequence in Table 6 is used to track which instructions
are being checked when storing.
Instruction Mem

Addr    Opcode      Addr    Opcode
----    ------      ----    ------
00                  2D      45071D
01      440103      2E      450B1E
02      440204      2F      000000
03      000000      30      000000
04      000000      31      000000
05      450108      32      450101
06      450209      33      450201
07      000000      34      450301
08      011320      35      450401
09      4114F9      36      450501
0A      21150A      37      450601
0B      000000      38      450701
0C      091630      39      450801
0D      45030D      3A      450901
0E      45040E      3B      450A01
0F      45050F      3C      450B01
10      450610      3D      450C01
11      2947FD      3E      450D01
12      0808FF      3F      450E01
13      0A1930      40      450F01
14      2A1AF0      41      000000
15      450711      42      000000
16      450812      43      000000
17      450913      44      44010D
18      450A14      45      44020E
19      181B20      46      44030F
1A      181C30      47      440410
1B      581D03      48      440511
1C      581E04      49      440612
1D      450B15      4A      440713
1E      450C16      4B      440814
1F      450D17      4C      440915
20      450E18      4D      440A16
21      114F20      4E      440B17
22      514305      4F      440C18
23      134510      50      440D19
24      524603      51      440E1A
25      450F19      52      440F1B
26      45031A      53      44011C
27      45051B      54      44021D
28      45061C      55      44031E
29      43877B      56      000000
2A      0B9BA0      57      000000
2B      000000      58      000000
2C      000000      59      000000


Register

Register    First value    Later value
--------    -----------    -----------
00
01          0003
02          0003
03          0006           FF80
04          FFFC
05          000D           FFFF
06          0002           1FFF
07          00FC           FE85
08          FF00
09          0007
10          00F3
11          0001           00F4
12          0000
13          0001
14          0000
15          FFE0


Data Mem

Addr        Value
--------    -----
00 to 07    (blank)
08          0003
09          0003
0A to 0C    (blank)
0D          0006
0E          FFFC
0F          000D
10          0002
11          00FC
12          FF00
13          0007
14          00F3
15          0001
16          0000
17          0001
18          0000
19          FFE0
1A          FF80
1B          FFFF
1C          1FFF
1D          FE85
1E          00F4
1F to 2A    (blank)


Table 7. Tables of Registers and Memories in Simulation 1

The Opcode 4114F9₁₆ at memory location 09₁₆ implements ADDI R1+ext(F9)→R4. The
original value of R1 is 0003₁₆, which equals 3₁₀. Since KDLX uses 2’s complement numbers,
the sign-extended value of F9₁₆ is FFF9₁₆, which is (-7) in decimal. The sum of 3₁₀ and
(-7)₁₀ is (-4)₁₀. Converting (-4)₁₀ to a 16-bit 2’s complement binary number gives FFFC₁₆
in hexadecimal. This agrees with the value in data memory location 0E₁₆.
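
The arithmetic above can be checked with a short Python sketch. It assumes only what the text states (an 8-bit immediate that is sign-extended to 16 bits and a 16-bit 2's complement result); the function names are illustrative and are not taken from the KDLX VHDL source.

```python
def sign_extend8(value):
    """Sign-extend an 8-bit immediate to a 16-bit word."""
    value &= 0xFF
    return (value | 0xFF00) if value & 0x80 else value


def to_signed16(word):
    """Interpret a 16-bit word as a 2's complement integer."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word


def addi(rs1_value, imm8):
    """Add a sign-extended 8-bit immediate and keep the low 16 bits."""
    return (rs1_value + sign_extend8(imm8)) & 0xFFFF


# ADDI R1+ext(F9)->R4 with R1 = 0003 (hex), as in the example above.
result = addi(0x0003, 0xF9)
print(f"ext(F9) = {sign_extend8(0xF9):04X} = {to_signed16(sign_extend8(0xF9))}")
print(f"R4      = {result:04X} = {to_signed16(result)}")
```

Masking with 0xFFFF models the 16-bit register width, so the wrap-around to FFFC falls out of ordinary integer addition.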

3. Implementation Table of Instruction Set 2 

The rest of the instructions (not including Jump and Branch) are listed in Table 8.
This table only shows the instructions that were tested in this thesis. The table does not
include the instructions for configuring memory contents. This will be explained further
in the simulation section of this chapter.


Instruction (operation symbol)     Opcode   Expected Value
SGE    R1≥R3→R13=1                 191D30
SW     R13→Mem(R0+1F)              450D1F   0001
SGE    R15≥R14→R9=0                19F9E0
SW     R9→Mem(R0+20)               450920   0000
SGEI   R15≥ext(E8)→R10=0           59FAE8
SW     R10→Mem(R0+21)              450A21   0000
SGEI   R15≥ext(E0)→R11=1           59FBE0
SW     R11→Mem(R0+22)              450B22   0001
SGT    R4>R15→R6=1                 1A46F0
SW     R6→Mem(R0+23)               450623   0001
SGT    R15>R4→R7=0                 1AF740
SW     R7→Mem(R0+24)               450724   0000
SGTI   R15>ext(FF)→R8=0            5AF8FF
SW     R8→Mem(R0+25)               450825   0000
SGTI   R15>ext(87)→R9=1            5AF987
SW     R9→Mem(R0+26)               450926   0001
SLE    R1≤R2→R10=1                 1B1A20
SW     R10→Mem(R0+27)              450A27   0001
SLE    R1≤R13→R11=0                1B1BD0
SW     R11→Mem(R0+28)              450B28   0000
SLEI   R1≤ext(03)→R12=1            5B1C03
SW     R12→Mem(R0+29)              450C29   0001
SLEI   R1≤ext(02)→R13=0            5B1D02
SW     R13→Mem(R0+2A)              450D2A   0000
SLT    R15<R1→R6=1                 1CF610
SW     R6→Mem(R0+01)               450601   0001
SLT    R1<R15→R7=0                 1C16F0
SW     R7→Mem(R0+02)               450702   0000
SLTI   R1<ext(0D)→R8=1             5C180D
SW     R8→Mem(R0+03)               450803   0001
SLTI   R1<ext(01)→R9=0             5C1901
SW     R9→Mem(R0+04)               450904   0000
SNE    R1≠R2→R10=0                 1D1A20
SW     R10→Mem(R0+05)              450A05   0000
SNE    R1≠R15→R11=1                1D1BF0
SW     R11→Mem(R0+06)              450B06   0001
SNEI   R1≠ext(03)→R12=1            581C03
SW     R12→Mem(R0+07)              450C07   0001
SNEI   R15≠ext(E1)→R13=0           58FDE1
SW     R13→Mem(R0+08)              450D08   0000
SRAI   R3>>(0006)→R6               533606
SW     R6→Mem(R0+09)               450609   FFFE
SRL    R3>>R2=(0003)→R7            123720
SW     R7→Mem(R0+0A)               45070A   1FF0
XORI   R15⊕(8A)→R8                 2BF88A
SW     R8→Mem(R0+0B)               45080B   FF6A
SUBUI  R3-(80)→R9                  233980
SW     R9→Mem(R0+0C)               45090C   FF00
SUB    R1-R3→R14                   031E30
SW     R14→Mem(R0+0D)              450E0D   0083

Table 8. Instruction Set 2



4. Simulation Result of Instruction Set 2 

The complete table set that shows all values inside memories and registers for this
simulation is shown in Table 9. In the instruction memory part of the table, the
instructions shown in Table 8 actually start at memory location 2A₁₆. Instructions before
this point are used to generate the same register values used in instruction set 1. The
first column of Table 9 shows values that are identical to the final results in Table 7.

The registers change many times during this simulation, but the table only shows 
the initial and final values. The first column as described in the last paragraph is the 
starting data for instruction set 2. The second column lists all final values in registers. 

This simulation uses different data memory locations than instruction set 1. This 
provides a boundary test for memory while testing KDLX. 

This instruction set demonstrates most of the possible comparisons between registers
or of a register with an immediate value. Since the KDLX uses 2’s complement values,
0003₁₆ is greater than FF80₁₆ (which is -128 in decimal). Logical operations like ANDI,
ORI, and XORI do not use sign extension on an immediate value.
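
A minimal Python sketch of these two rules follows. It assumes that comparisons interpret both operands as 16-bit 2's complement values and that logical immediates are zero-extended, as described above; the helper names are illustrative only.

```python
def to_signed16(word):
    """Interpret a 16-bit word as a 2's complement integer."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word


def sign_extend8(value):
    """Sign-extend an 8-bit immediate to a 16-bit word."""
    value &= 0xFF
    return (value | 0xFF00) if value & 0x80 else value


def sgt(a, b):
    """Set-if-greater-than on signed 16-bit register values."""
    return int(to_signed16(a) > to_signed16(b))


def sgei(a, imm8):
    """Set-if-greater-or-equal against a sign-extended immediate."""
    return int(to_signed16(a) >= to_signed16(sign_extend8(imm8)))


def xori(a, imm8):
    """XOR with a zero-extended (not sign-extended) immediate."""
    return (a ^ (imm8 & 0xFF)) & 0xFFFF


print(sgt(0x0003, 0xFF80))            # 0003 compared with FF80 as signed values
print(sgei(0xFFE0, 0xE0))             # R15 = FFE0 against ext(E0)
print(f"{xori(0xFFE0, 0x8A):04X}")    # R15 = FFE0 XOR 8A without sign extension
```

With sign extension the XORI result would be 006A instead of FF6A, so the stored value in Table 8 is a direct check that the immediate is zero-extended.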

Instruction Mem

Addr  Opcode    Addr  Opcode
00              30    450A21
01    410103    31    450B22
02    410203    32    1A46F0
03    0803FF    33    1AF740
04    0804FF    34    5AF8FF
05    0805FF    35    5AF987
06    08061F    36    450623
07    410380    37    450724
08    4104FC    38    450825
09    4105FF    39    450926
0A    2166FF    3A    1B1A20
0B    0807FE    3B    1B1BD0
0C    0808FF    3C    5B1C03
0D    080FFF    3D    5B1D02
0E    210AF3    3E    450A27
0F    217785    3F    450B28
10    210BF4    40    450C29
11    410907    41    450D2A
12    410D01    42    1CF610
13    410E00    43    1C17F0
14    410C00    44    5C180D
15    410FE0    45    5C1901
16    000000    46    450601
17    000000    47    450702
18    450100    48    450803
19    450200    49    450904
1A    450300    4A    1D1A20
1B    450400    4B    1D1BF0
1C    450500    4C    581C03
1D    450600    4D    58FDE1
1E    450700    4E    450A05
1F    450800    4F    450B06
20    450900    50    450C07
21    450A00    51    450D08
22    450B00    52    533603
23    450C00    53    123720
24    450D00    54    2BF88A
25    450E00    55    233980
26    450F00    56    031E30
27    000000    57    450609
28    000000    58    45070A
29    000000    59    45080B
2A    191D30    5A    45090C
2B    19F9E0    5B    450E0D
2C    59FAE8    5C    000000
2D    59FBE0    5D    000000
2E    450D1F    5E    000000
2F    450920    5F    000000

Register

Reg   Initial   Final
00
01    0003      0003
02    0003      0003
03    FF80      FF80
04    FFFC      FFFC
05    FFFF      FFFF
06    1FFF      FFFE
07    FE85      1FF0
08    FF00      FF6A
09    0007      FF00
10    00F3      0000
11    00F4      0001
12    0000      0001
13    0001      0000
14    0000      0083
15    FFE0      FFE0

Data Mem

Addr    Value
00
01      0001
02      0000
03      0001
04      0000
05      0000
06      0001
07      0001
08      0000
09      FFFE
0A      1FF0
0B      FF6A
0C      FF00
0D      0083
0E-1E
1F      0001
20      0000
21      0000
22      0001
23      0001
24      0000
25      0000
26      0001
27      0001
28      0000
29      0001
2A      0000

Table 9. Tables of Registers and Memories in Simulation 2 

5. Implementation Table of Instruction Set 3

This instruction set starts by testing the Jump and Branch instructions. The complete
implementation is listed in Table 10. There are no divisions in this table and the
sequence of execution is from top to bottom. If an instruction jumps to the wrong memory
location, one or all contents of the target registers will not agree with the expected value
shown here.



Instruction (operation symbol)                Opcode   Expected Value
LW     R1←Mem(R0+03)                          410103
LW     R2←Mem(R0+04)                          410204
LW     R3←Mem(R0+00)                          410300
LW     R4←Mem(R0+06)                          410406
BNEZ   R1≠0→Prog_Addr←(05)+1+ext(04)          C01004
       Note: PC=05 and (05)+1+ext(04)=0A
BEQZ   R3=0→Prog_Addr←(0A)+1+ext(04)          C13004
       Note: PC=0A and (0A)+1+ext(04)=0F
ADDI   R0+ext(25)→R5                          410525
J      (0020)→Prog_Addr                       C80020
JAL    (0014)→Prog_Addr ; (23)→R15            E80014
       Note: (23) is return address
ADDI   R0+ext(8A)→R6                          41068A
ADDI   R0+ext(40)→R7                          410740
ADD    R1+R2→R8                               011820
ADD    R1+R4→R9                               011940
SW     R15→Mem(R0+01)                         450F01   0023
JALR   R5→Prog_Addr ; (1D)→R15                685000
       Note: (1D) is return address
J      (0030)→Prog_Addr                       C80030
SW     R5→Mem(R0+02)                          450502   0025
SW     R6→Mem(R0+03)                          450603   FF8A
SW     R7→Mem(R0+04)                          450704   0040
SW     R8→Mem(R0+05)                          450805   0007
SW     R9→Mem(R0+06)                          450906   0009
SW     R15→Mem(R0+07)                         450F07   001D
JR     R7→Prog_Addr                           487000
SW     R2→Mem(R0+08)                          450208   0004

Table 10. Instruction Set 3 

6. Simulation Result of Instruction Set 3

For Jump and Branch instructions, the sequence of instructions in memory is not
the sequence of execution. This can be easily understood by looking at Table 11.

The black arrows represent the normal sequence of operation. The blue dashed lines
stand for Jump or Branch instructions without link, and the blue solid lines stand for
Jump and Link or Branch and Link.

The first branch occurs at memory location 05₁₆. Since the program counter at
that point is 05₁₆, it branches to memory location 0A₁₆ with a given immediate value 04₁₆.
The action of branching occurs two clocks later due to pipelining, so the instructions at
memory locations 06₁₆ and 07₁₆ are fetched before the sequence branches to the new
address.

At memory location 0A₁₆, another branch instruction is executed. It branches to
another memory location, 0F₁₆. Because the Opcode 410525₁₆ is fetched before the
branch occurs, an immediate value is added into R5. This can be checked in the register
table or in data memory location 02₁₆, where Opcode 450502₁₆ stores the data.

Opcode E80014₁₆ is a Jump and Link instruction. It jumps to address 14₁₆ and
saves address 23₁₆ into R15. The jump takes effect at address 23₁₆, not at address 20₁₆,
21₁₆ or 22₁₆: in each case, the two instructions following Jump and Link are fetched
before the jump instruction is executed.

The instruction at memory location 1A₁₆ is Jump Register and Link. This allows
KDLX to read the address it wishes to jump to directly from its internal register.
Suppose one register is reserved for a special purpose and it contains a special memory
location. Then KDLX can always jump to that specific memory location by simply reading
the contents of that register, without any extra instructions.
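
The branch targets and return addresses quoted in this section follow a simple pattern, sketched here in Python. The sketch assumes the relations given in Table 10 (target = PC + 1 + ext(immediate)) and a link address that skips the two instructions fetched after the jump; `DELAY_SLOTS` and the function names are illustrative assumptions, not KDLX identifiers.

```python
DELAY_SLOTS = 2  # assumed: two instructions are fetched after a jump or branch


def sign_extend8(value):
    """Sign-extend an 8-bit immediate to a 16-bit word."""
    value &= 0xFF
    return (value | 0xFF00) if value & 0x80 else value


def branch_target(pc, imm8):
    """Target of a taken branch: PC + 1 + ext(immediate)."""
    return (pc + 1 + sign_extend8(imm8)) & 0xFFFF


def link_address(pc):
    """Return address saved by a Jump and Link located at address pc."""
    return (pc + 1 + DELAY_SLOTS) & 0xFFFF


print(f"{branch_target(0x05, 0x04):02X}")  # BNEZ at 05 with immediate 04
print(f"{branch_target(0x0A, 0x04):02X}")  # BEQZ at 0A with immediate 04
print(f"{link_address(0x20):02X}")         # JAL at 20
print(f"{link_address(0x1A):02X}")         # JALR at 1A
```

The four printed values can be compared with the notes in Table 10 (0A, 0F, 23 and 1D).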

Register

Reg   Value
00
01    0003
02    0004
03    0000
04    0006
05    0025
06    FF8A
07    0040
08    0007
09    0009
10
11
12
13
14
15

Instruction Mem

[Arrows showing the execution sequence are not reproduced here.]

Addr  Opcode
00
01    410103
02    410204
03    410300
04    410406
05    C01004
06    000000
07    000000
08
09
0A    C13004
0B    410525
0C    000000
0D
0E
0F    C80020
10    000000
11    000000
12
13
14    011820
15    011940
16    450F01
17    000000
18    000000
19    000000
1A    685000
1B    000000
1C    000000
1D
1E
1F
20    E80014
21    41068A
22    410740
23
24
25    C80030
26    000000
27    000000
28
29
2A
...
30    450502
31    450603
32    450704
33    450805
34    450906
35    450F07
36    487000
37    000000
38    000000
...
40    450208
41    000000
42    000000
43    000000

Data Mem

Addr    Value
00
01      0023
02      0025
03      FF8A
04      0040
05      0007
06      0009
07      001D
08      0004
09-2A

Table 11. Tables of Registers and Memories in Simulation 3 

7. Implementation Table of Instruction Set 4

This instruction set contains one of the most complicated instructions in the TMR
design, the TRAP instruction. The TRAP instruction acts like Jump and Link or
Branch and Link. The difference is that it saves its return address into the IAR, not into
R15. The IAR is a specific register mentioned earlier when introducing the pc_control
inside KDLX. Storing the return address in the IAR not only saves a register but also
guarantees the integrity of the return address, since the IAR is only accessible to the
TRAP instruction.

Another feature of the TRAP instruction is that it has a companion instruction called
Return from Exception (RFE). The RFE, Opcode F80000₁₆, only reads the content of the IAR
and jumps to that address. Since the IAR always contains the return address of the TRAP
instruction, the RFE instruction only works with the TRAP instruction.

Instruction set 4 for testing the TRAP instruction is shown in Table 12.



Instruction (operation symbol)                Opcode   Expected Value
ADDI   R0+ext(04)→R1                          410104
ADDI   R0+ext(07)→R2                          410207
TRAP   (0020)→Prog_Addr ; (06)→IAR            280020
       Note: (06) is return address
ADDI   R0+ext(09)→R3                          410309
ADDI   R0+ext(15)→R4                          410415
ADDI   R0+ext(0A)→R7                          41070A
ADDI   R0+ext(11)→R8                          410811
ADDI   R0+ext(C2)→R10                         410AC2
RFE    (06)→Prog_Addr                         F80000
       Note: (06) is IAR
J      (0011)→Prog_Addr                       C80011
SW     R1→Mem(R0+01)                          450101   0004
SW     R2→Mem(R0+02)                          450202   0007
SW     R3→Mem(R0+03)                          450303   0009
SW     R4→Mem(R0+04)                          450404   0015
SW     R7→Mem(R0+07)                          450707   000A
SW     R8→Mem(R0+08)                          450808   0011
SW     R10→Mem(R0+0A)                         450A0A   FFC2


Table 12. Instruction Set 4 

8. Simulation Result of Instruction Set 4

The features of the TRAP instruction are shown in Table 13. When fetching the
TRAP instruction at memory location 03₁₆, KDLX stores the return address 06₁₆ to the
IAR. Two clock cycles after the TRAP, the program counter changes to 20₁₆ and reads
the instruction at that address. After executing a few instructions, the KDLX sees the
Opcode F80000₁₆ and retrieves address 06₁₆ for the return. The content at location 06₁₆
is a Jump instruction. Therefore, the KDLX jumps again to memory location 11₁₆.

Some important features can be found in this implementation. First, the TRAP
occurs exactly after 2 clock cycles; otherwise the Opcode C80011₁₆ would be fetched.
Second, the IAR is not directly addressable, so using Opcode F80000₁₆ is the only way to
verify the content of the IAR. Third, instruction set 4 can be an infinite loop if the test
bench never stops. After jumping to memory location 11₁₆, the program counter keeps
counting in order to read instructions. If no other signal stops the KDLX, it will read
Opcode F80000₁₆ again. This retrieves the IAR and jumps back to memory location 06₁₆.
The Opcode C80011₁₆ will lead KDLX to jump to address 11₁₆ and then keep on reading
instructions until it hits F80000₁₆ again. This loop can be observed in the full
simulation result for instruction set 4 in Appendix A, section C.
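
The fetch sequence described above, including the loop, can be reproduced with a toy control-flow model in Python. This is only a sketch under stated assumptions: every control instruction takes effect after two further fetches, unlisted addresses behave as no-operations, and control instructions never sit in the two slots after another control instruction. It models fetch addresses and the IAR only, not the KDLX datapath, and all names are hypothetical.

```python
DELAY_SLOTS = 2  # assumed: a jump takes effect after two more fetches


def run(program, steps):
    """Return the fetch-address trace and the final IAR value."""
    pc, iar = 0, None
    slots_left, target = 0, None
    trace = []
    for _ in range(steps):
        trace.append(pc)
        op, arg = program.get(pc, ("NOP", None))
        next_pc = pc + 1
        if target is not None:
            # A jump is pending: this fetch fills one of its delay slots.
            slots_left -= 1
            if slots_left == 0:
                next_pc, target = target, None
        elif op == "TRAP":
            iar = pc + 1 + DELAY_SLOTS      # return address saved in the IAR
            slots_left, target = DELAY_SLOTS, arg
        elif op == "RFE":
            slots_left, target = DELAY_SLOTS, iar
        elif op == "J":
            slots_left, target = DELAY_SLOTS, arg
        pc = next_pc
    return trace, iar


# Control instructions of instruction set 4 (Table 12); all else acts as NOP.
program = {0x03: ("TRAP", 0x20), 0x06: ("J", 0x11), 0x26: ("RFE", None)}
trace, iar = run(program, 50)
print(" ".join(f"{addr:02X}" for addr in trace))
print(f"IAR = {iar:02X}")
```

In the printed trace, the RFE at 26 is reached a second time after the jump to 11, which is the loop the text describes.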

[Arrow annotations (IAR) are not reproduced here.]

Instruction Mem

Addr    Opcode
00
01      410104
02      410207
03      280020
04      410309
05      410415
06      C80011
07      000000
08      000000
09-10
11      450101
12      450202
13      450303
14      450404
15      450707
16      450808
17      450A0A
18      000000
19      000000
1A      000000
1B-1F
20      41070A
21      410811
22      410AC2
23      000000
24      000000
25      000000
26      F80000
27      000000
28      000000
29-2A

Data Mem

Addr    Value
00
01      0004
02      0007
03      0009
04      0015
05-06
07      000A
08      0011
09
0A      FFC2
0B-2A

Register

Reg     Value
00
01      0004
02      0007
03      0009
04      0015
05-06
07      000A
08      0011
09
10      FFC2
11-15

Table 13. Tables of Registers and Memories in Simulation 4 

F. CHAPTER SUMMARY

This chapter introduced several important components inside KDLX and discussed
pipeline concepts. Drawing a schematic from VHDL code is a good way to understand
KDLX.

The simulation of KDLX with and without memory illustrated the concept of the
pipeline and developed ideas on how to organize a test bench. Most of the tables
necessary for simulation purposes were generated in this chapter. Having the tables
generated before constructing a test bench helps a designer to understand what the goal is
and how to achieve it. Tables created by the simulation give a designer a big picture of
how things interact with each other. Sometimes things are hard to say but easy to see.

The TMR Assembly is designed in the next chapter. The function of the voter
and how it corrects an error will be explained. Then we will combine three KDLX
processors with voters to form a TMR Assembly. Important simulation concepts will be
reviewed as well.

V. TMR ASSEMBLY

The TMR Assembly is composed of three KDLX processors with voters on all
outputs. All of the KDLX instructions have been tested in the simulation described in the
previous chapter and the fundamental concept of KDLX has been established. The next
step is to realize the function of a voter.

A voter is constructed from a few simple logic gates and is able to find an error
when its inputs are not consistent. Since the CFTP will be operating in a relatively benign
low Earth orbit (LEO), the TMR design does not have to deal with too many errors per unit
time. The assumption of the TMR design is that we will not see identical errors on two
processors at the same time. The voters pass the majority vote so, if the errors are
identical, they will not be detected (and will, in fact, be turned into truth).

A. 1-BIT VOTER

The CFTP is designed to be fault tolerant by software. Its circuit needs to be able
to detect an error and correct the error by itself. In order to achieve that, the concept of a
voter is introduced.

The function of a 1-bit voter has been introduced in Lashomb’s thesis [1]. This
section reviews the basic concepts and then starts constructing the TMR Assembly.
Figure 19 shows what a 1-bit voter looks like. It is a simple circuit consisting of only AND
and OR gates.



Figure 19. 1-Bit Majority Voter (After Ref. [1].)

The voter function is more obvious in the truth table shown in Table 14. This
voter always selects the majority of identical bits as its output bit. If two or more inputs
are incorrect, the voter output will also be incorrect. The ability to detect and correct two
or more errors in a voter is not vital for a system (e.g., the CFTP) in LEO.


A  B  C  Y
0  0  0  0
0  0  1  0
0  1  0  0
0  1  1  1
1  0  0  0
1  0  1  1
1  1  0  1
1  1  1  1

Table 14. Truth Table of a 1-Bit Voter (From Ref. [1].)
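
The truth table can be regenerated from the AND/OR structure with a few lines of Python. This is a behavioral sketch of a majority function, not the schematic from Ref. [1]; the function name is illustrative.

```python
from itertools import product


def majority(a, b, c):
    """1-bit majority vote built from pairwise ANDs combined by an OR."""
    return (a & b) | (b & c) | (a & c)


print("A B C Y")
for a, b, c in product((0, 1), repeat=3):
    print(a, b, c, majority(a, b, c))
```

Each AND term is 1 when a particular pair of inputs agrees on 1, so the OR of the three terms is 1 exactly when at least two inputs are 1.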


Assuming a single error, the output is always correct, but we cannot tell if there 
has been an error just by looking at this output. Therefore, some extra gates are added to 
report the occurrence of an error. Figure 20 shows a voter with error detection and Table 
15 is its truth table. 



Figure 20. Voter with Error Detection (After Ref. [1].)

A  B  C  Y  ERR
0  0  0  0  0
0  0  1  0  1
0  1  0  0  1
0  1  1  1  1
1  0  0  0  1
1  0  1  1  1
1  1  0  1  1
1  1  1  1  0


Table 15. Truth Table of Voter with Error Detection (From Ref. [1].) 
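
A behavioral Python sketch of the voter with error detection is given below. It reproduces the Y and ERR columns of Table 15 but is not a gate-level copy of Figure 20: the disagreement check is written here with XORs as an assumption of convenience.

```python
from itertools import product


def vote_with_error(a, b, c):
    """Return (Y, ERR): the majority bit and a flag set when inputs disagree."""
    y = (a & b) | (b & c) | (a & c)
    err = (a ^ b) | (b ^ c)   # 0 only when all three inputs are identical
    return y, err


print("A B C Y ERR")
for a, b, c in product((0, 1), repeat=3):
    y, err = vote_with_error(a, b, c)
    print(a, b, c, y, err)
```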

The error detection output, ERR, is 1 when one of the inputs is not identical to the
rest. When the CFTP is in space, it is possible to have an SEU on the voter itself. A bit
flip may cause the voter output to be incorrect. Say the second row of Table 15 has a bit
flip on A. This flip makes 1 the majority bit, and output Y will give a 1, not a 0. Since
a voter is used to catch and correct an error, it is undesirable for the voter to have an
error itself. Thus, some reliability is needed for the voter. A voter with added reliability
is shown in Figure 21.


Figure 21. Voter with Added Reliability (After Ref. [1].)


This version is built by duplicating the original part of the voter and XORing the
two parts to generate a voter error detection signal, V_ERR. If the voter itself has an
error, the outputs of the two OR3 gates in Figure 21 will not agree with each other, and
V_ERR becomes 1. Table 16 is the truth table of this circuit.


A  B  C  Y  V_ERR
0  0  0  0  0
0  0  1  0  0
0  1  0  0  0
0  1  1  1  0
1  0  0  0  0
1  0  1  1  0
1  1  0  1  0
1  1  1  1  0

Table 16. Truth Table of Voter with Added Reliability (From Ref. [1].)
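
The duplicate-and-compare idea can be sketched in Python as follows. The `flip_copy` parameter is a hypothetical fault-injection hook added only for illustration; it stands in for an SEU that corrupts the output of one copy of the voting logic.

```python
from itertools import product


def majority(a, b, c):
    """1-bit majority vote."""
    return (a & b) | (b & c) | (a & c)


def reliable_voter(a, b, c, flip_copy=None):
    """Return (Y, V_ERR) from two copies of the voter compared by XOR.

    flip_copy may be None, "main" or "dup"; it inverts the output of the
    chosen copy to imitate an upset inside the voter itself.
    """
    y_main = majority(a, b, c)
    y_dup = majority(a, b, c)
    if flip_copy == "main":
        y_main ^= 1
    elif flip_copy == "dup":
        y_dup ^= 1
    return y_main, y_main ^ y_dup


for a, b, c in product((0, 1), repeat=3):
    print(a, b, c, *reliable_voter(a, b, c))
print(reliable_voter(0, 0, 1, flip_copy="dup"))
```

Without an injected fault V_ERR stays 0 for every input combination, as in Table 16; a fault in either copy raises V_ERR, although the comparison alone cannot say which copy is wrong.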


The last step is to collect all of these pieces to construct a complete single-bit
voter. As introduced earlier, a voter with error detection is able to correct the error and
tell the user an error has occurred. For the TMR design, knowing the existence of an
error is not good enough since the error also has to be corrected. In order to correct the
error, the faulty input needs to be identified. With all these considerations, a complete
circuit is generated as shown in Figure 22. The truth table for this circuit is Table 17.



Figure 22. Complete Majority Voter (After Ref. [1].) 

New signals CID_0 and CID_1 are used to identify the faulty input, with CID_0
representing the least significant bit. Using the third row of the table as an example, the
voter should be able to capture the error and identify the faulty input pin. The output
signal Y is a 0 and D_ERR, the error detection signal, reports a 1. This indicates that one
of the input signals is not consistent and the correct input signal is 0. Furthermore, CID_1
and CID_0 show 1 and 0, respectively, which means the second processor is faulty. Since
Y is 0 and the second input is faulty, it can be concluded that input B has an error and its
value is 1.
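
The worked example can be expressed as a small Python sketch. Only one code point is confirmed by the text (CID_1 CID_0 = 10 identifies the second input, B); the remaining encoding used here (00 for no error, 01 for A, 11 for C) is an assumption made for illustration, as are the function and key names.

```python
def complete_voter(a, b, c):
    """Return Y, D_ERR and the faulty-input code for a 1-bit vote.

    Assumed encoding of (CID_1, CID_0): 00 = none, 01 = A, 10 = B, 11 = C.
    Only the code for B is stated in the text.
    """
    y = (a & b) | (b & c) | (a & c)
    d_err = (a ^ b) | (b ^ c)
    if not d_err:
        cid = 0
    elif a != y:
        cid = 1
    elif b != y:
        cid = 2
    else:
        cid = 3
    return {"Y": y, "D_ERR": d_err, "CID_1": cid >> 1, "CID_0": cid & 1}


# Third row of the truth table: A = 0, B = 1, C = 0.
print(complete_voter(0, 1, 0))
```

Under the single-error assumption, the faulty input is simply the one that differs from the voted output, which is why one comparison per input is enough.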

The schematic of the complete majority voter built in ISE is shown in Figure 23. 
All input and output pins are 1-bit wide. 


[Schematic symbol: inputs A, B and C; outputs Y, V_ERR, CID_0, CID_1 and D_ERR]

Figure 23. Schematic Symbol of 1-Bit Majority Voter

B. 16-BIT VOTER 

Since KDLX has 16-bit output buses, 16-bit voters are needed in order to vote
every bit on these buses. A 16-bit voter is simply composed of sixteen 1-bit voters as
shown in Figure 24. All voters vote in parallel and produce five output buses for five
different signals: Y, V_ERR, CID_0, CID_1, and D_ERR.
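
Because the sixteen 1-bit voters work independently, a 16-bit vote can be sketched in Python with bitwise operators, each bit position acting as its own voter. Only the voted word and the per-bit error flags are modeled; the example words are made up for illustration.

```python
MASK16 = 0xFFFF


def vote16(a, b, c):
    """Bitwise majority of three 16-bit words plus a per-bit error flag word."""
    y = ((a & b) | (b & c) | (a & c)) & MASK16
    err = ((a ^ b) | (b ^ c)) & MASK16
    return y, err


good = 0x005A
upset = good ^ 0x0004          # single bit flipped on input B
y, err = vote16(good, upset, good)
print(f"Y = {y:04X}, ERR = {err:04X}")
```

The set bit in the error word marks the bit position where one input disagreed, while the voted word still carries the unflipped value.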

[Figure 24: Sixteen 1-Bit Voters]

Figure 25 is the schematic symbol used in ISE. The signal name D_ERR is
changed to ERR in order to simplify the notation.



[Schematic symbol: inputs A(15:0), B(15:0) and C(15:0); outputs Y(15:0), V_ERR(15:0),
CID_0(15:0), CID_1(15:0) and ERR(15:0)]

Figure 25. Schematic Symbol of 16-Bit Voter

The voter performs an important role in TMR. It is the device that catches and reports
errors. The CFTP in space can have an SEU occur anywhere in the FPGA. If the error is
caught by the voter, it will be corrected. If the voter votes incorrectly, it will be caught
by the voter error detection circuitry. The problem becomes more complicated if an error
occurs in the voter error detection itself. If the voter voted wrong but the error detection
did not catch it, the error may propagate through the system and corrupt the data. A new
circuit can be added to check the error detection, but adding gates increases the
probability of an error and also increases the complexity. Making a voter that has
acceptable reliability without increasing the probability of an SEU too much is difficult.

C. TMR ASSEMBLY WITHOUT MEMORY 

The concept of the TMR is to triplicate processors and vote all output signals to
get correct values. An even number of processors cannot use majority voters. Five or
more processors will increase the circuit size dramatically. As described earlier, this
increases the probability of having an error caused by an SEU. The usual compromise is
to use three processors. The TMR does not increase circuitry too much and its
effectiveness has been proven in some existing space systems.

In this section, several different architectures will be discussed, which is a good
chance to show how things change when different components are used. Important
learning points will be provided at the end of this chapter.

1. Schematic and Simulation 1


Figure 26 is the first design of the TMR Assembly for this thesis. Important
signals are indicated with arrows. The three big blocks at the left side are KDLX
processors. The sequence from top to bottom is processor A, B and C. The 24-bit
instruction input buses are instr_a(23:0), instr_b(23:0), and instr_c(23:0), respectively.

Voters are connected at the outputs of the processors. All of the outputs are
voted. The first three voters at the top are 1-bit voters for control signals and the other
three are 16-bit voters for buses. The voter at the top is the voter for the program read
signal. The read signals for the instruction fetch of all three processors are connected to
this voter to be voted. The second one is the voter for data read signals and the third one
is for data write signals. The three 16-bit voters are for the address, the program counter,
and the data bus, respectively.

The outputs of each voter are collected to a bus. Therefore, there are four buses
on the right side. One data bus is at the output of the data voter, named data_p(15:0).
Since each bus on the right side collects the outputs of six voters, each bus is 51 bits
wide.

Because the data memory used in the ISE has separate buses for the input and the
output, data_p(15:0) is generated as a write bus and data_m(15:0) is generated as a read
bus. The read and write signals are active low. Thus, inverters are used to enable
buffers. Without a buffer for isolation, data injected at data_m(15:0) will be voted and sent
out to data_p(15:0), which may cause a bus conflict.

Figure 26. TMR Assembly

This design so far provides everything needed for a TMR processor based on the
theory described in section B. The next step was to put it on a simulation test bench and
run it. The time constraints are 50 ns for clock high and low time and 10 ns for setup and
hold time. Since only one clock is used in this simulation, the time constraints are trivial.
The simulation results are shown in Figures 27 and 28.

Figure 27. TMR Assembly Simulation 1-1

In Figure 27, the data_m bus offers a series of data regardless of whether the
instruction needs it or not. All instruction buses (i.e., instr_a, instr_b and instr_c) have
the same instruction at the same time. The first instruction, LW R1←Mem(R0+04), is
fetched at point 1. It is not executed until point 2. Since the read signal goes low at point
2, it is reasonable to say it loads data 005A₁₆. Signals cid_0, cid_1 and err all report zero
because all instructions are consistent. Notice that the data on the data_m bus changes
while read_p is still low. A clipping occurs at point 3.

In Figure 28, another instruction, SW R1→Mem(R0+02), is fetched. Since R1
had already fetched data at point 2, here we expected to see 005A₁₆ on the data_p bus.
Unfortunately this is not the case at point 5. The simulation tells us that KDLX has the
read signal active low, but it actually reads data at the rising edge. In this simulation, it
read 0061₁₆ at point 3, not 005A₁₆ as desired.

Figure 28. TMR Assembly Simulation 1-2

Since the processor reads at the rising edge, the circuit must be able to keep the
data stable to that point. The simulation in Figure 27 shows that 005A₁₆ stays for most of
the duration while read_p is low. However, the bus changes to 0061₁₆ at the last instant,
which is not a desirable situation. Thus the next step is to modify the circuit to make the
data stable through the rising edge of read_p. Figure 29 is the modified design.

2. Schematic and Simulation 2 

A 16-bit latch is added to keep the input data stable. With this latch, the input
data only changes when the read signal changes, which should, in theory, provide a perfect
timing match. Simulations of this modified TMR Assembly are shown in Figures 30 and
31.
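
The effect of the latch can be illustrated with a discrete-time Python sketch. The waveforms below are invented for illustration and are not taken from the simulation figures; the model assumes that the processor samples on the rising edge of read_p and that the latch captures the bus when read_p goes low and then holds it.

```python
def sample_on_rising_edge(read_p, data_m, use_latch):
    """Return the values the processor sees at each rising edge of read_p."""
    held = data_m[0]
    seen = []
    for t in range(1, len(read_p)):
        falling = read_p[t - 1] == 1 and read_p[t] == 0
        rising = read_p[t - 1] == 0 and read_p[t] == 1
        if falling:
            held = data_m[t]          # latch captures the bus as read_p goes low
        if rising:
            seen.append(held if use_latch else data_m[t])
    return seen


# Hypothetical waveforms: the bus moves on to 0061 just as read_p rises.
read_p = [1, 1, 0, 0, 0, 1, 1]
data_m = [0x0000, 0x0000, 0x005A, 0x005A, 0x005A, 0x0061, 0x0061]

print([f"{v:04X}" for v in sample_on_rising_edge(read_p, data_m, use_latch=False)])
print([f"{v:04X}" for v in sample_on_rising_edge(read_p, data_m, use_latch=True)])
```

Without the latch the sampled word is whatever the bus carries at the rising edge; with the latch it is the word that was present when the read began.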

Figure 29. Modified TMR Assembly

Figure 30. Modified TMR Assembly Simulation 2-1

Points 6 and 7 in Figure 30 are identical to points 1 and 2 of Figure 27. The
improvement of the modified TMR Assembly appears at point 8. The latched data is still
available at the point where read_p goes high and all three processors now read the value
005A₁₆. The clipping at point 3 in Figure 27 disappears.

Figure 31. Modified TMR Assembly Simulation 2-2

Figure 31 continues the simulation to store the content of R1 to memory location
02₁₆ at point 9. Following the signal write_p to point 10, one can find that the data on
data_p is 005A₁₆. Signals cid_1, cid_0, err and v_err show that no error is reported.

D. TMR ASSEMBLY WITH MEMORIES 

Since a working TMR Assembly has been generated, the final step is to hook it up
with memories. The latch added in Figure 29 guarantees that the processors will read
what they need to read. The schematic symbol of the TMR Assembly is shown in Figure
32. The whole circuit is shown in Figure 33.

Figure 32. Schematic Symbol of the Modified TMR Assembly

Many of the signals in Figure 33 are for the purpose of monitoring the simulation.
As a convention, the memory at the left is the instruction memory and the one at the right
is the data memory. Two buffers are used to control the data flow. Data flows into the
data memory only when the write signal is low and flows to the TMRA only when the
read signal is low.

The instruction memory is pre-configured with the following Opcodes: 440301₁₆,
413406₁₆, and 450407₁₆. The first one will load data from memory location 01₁₆ to R3.
The second one will add an immediate value 06₁₆ to R3 and save the result to R4. The
final instruction will store the content of R4 to memory location 07₁₆. Figure 34 shows the
simulation result.
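
The expected outcome of this three-instruction program can be worked out with a small Python sketch. The field layout used here (8-bit opcode, 4-bit source register, 4-bit destination or data register, 8-bit immediate) is inferred from the Opcodes discussed in this chapter and should be read as an assumption, as should the preloaded value 0003₁₆ at data memory location 01₁₆. Pipelining and timing are ignored.

```python
def sign_extend8(value):
    """Sign-extend an 8-bit immediate to a 16-bit word."""
    value &= 0xFF
    return (value | 0xFF00) if value & 0x80 else value


def decode(word):
    """Split a 24-bit word into (opcode, rs1, rd, imm8); layout is assumed."""
    return (word >> 16) & 0xFF, (word >> 12) & 0xF, (word >> 8) & 0xF, word & 0xFF


def run(program, data_mem):
    """Execute LW (44), ADDI (41) and SW (45) words in order."""
    regs = [0] * 16
    for word in program:
        opcode, rs1, rd, imm8 = decode(word)
        value = (regs[rs1] + sign_extend8(imm8)) & 0xFFFF
        if opcode == 0x44:        # LW: rd <- Mem(rs1 + imm)
            regs[rd] = data_mem.get(value, 0)
        elif opcode == 0x41:      # ADDI: rd <- rs1 + ext(imm)
            regs[rd] = value
        elif opcode == 0x45:      # SW: Mem(rs1 + imm) <- rd
            data_mem[value] = regs[rd]
        regs[0] = 0               # R0 is treated as constant zero
    return regs, data_mem


memory = {0x01: 0x0003}           # assumed initial content of location 01
regs, memory = run([0x440301, 0x413406, 0x450407], memory)
print(f"R3 = {regs[3]:04X}, R4 = {regs[4]:04X}, Mem[07] = {memory[0x07]:04X}")
```

Under these assumptions the value written to location 07 is the loaded word plus six, which is the figure the simulation is expected to show on the write bus.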

Figure 33. Modified TMR Assembly with Memories

Figure 34. Simulation of Modified TMR Assembly with Memories

Unfortunately, although no error was reported, no data was sent out from the data
memory. If this design worked correctly, an output value of 0009₁₆ should be seen when
the TMRA writes to memory. This did not happen when addr_rom was 0E₁₆.
Since no timing mismatches occurred anywhere, this design was hard to debug. The
modified TMR Assembly works fine without memories, so the problem could have been
the settings of this test bench. The time constraints of this test bench are listed in Table
18.

Parameter             Processors    Memories
Clock High Time       50 ns         50 ns
Clock Low Time        50 ns         50 ns
Input Setup Time      10 ns         5 ns
Output Valid Delay    10 ns         5 ns
Time Offset           0 ns          0 ns

Table 18. Time Constraints of Test Bench for Modified TMR Assembly 


Memories have shorter setup and hold times, so they should be ready before the 
processors need their data. From this point of view, the test bench did not seem to be the 
problem. While the problem might have been an incompatibility with the choice of 
memory, the next approach was to try the original TMR Assembly without the data 
latch, as shown in Figure 26. Since all input and output signals are the same as those of the 
modified TMR Assembly, the schematic symbol and complete design of the original TMR 
Assembly are still identical to Figures 32 and 33, respectively. Using the same test bench 
and simulation as for the first design produced the result shown in Figure 35. 

This version works. There are almost no timing mismatches and the data clippings 
are small enough to be ignored. This circuit sends out exactly the right value after the 
last instruction is executed. When addr_rom is at 0E₁₆, 0009₁₆ is sent out from the TMRA 
to the data memory in the low half of the clock cycle. The data as seen on out_mem has 
another half clock delay caused by the memory. Signals cid_1, cid_0, err and v_err verify 
that no error is reported. 

Figure 35. Simulation Result of First TMR Assembly with Memories 

The final conclusion is that the latch added in Figure 29 does not help when the 
TMRA is connected with memories. The simulation results in Figures 30 and 31 worked 
because the input data was set manually. These manual changes set the error regardless 
of the changing of the read or write signals from the processors. Therefore, a latch was 
needed in this manual test bench. 

When the TMRA is connected with memories, the memories will interact with the 
write signal of the KDLX even though the detailed interaction among them is not visible 
in the test bench. A latch in the TMRA in this design will ruin the timing between the 
TMRA and the data memory. Thus, the simulation result in Figure 34 shows that the 
TMRA is totally unable to communicate with the data memory, while in Figure 35, 
without the latch, the design works. 

E. TEST ON FAULT TOLERANCY OF TMR ASSEMBLY 

The concept of the TMR Assembly has been described and explained earlier in 
this chapter. The usage of the voters has been emphasized as well. Since the TMR As¬ 
sembly has been designed and simulated, the next requirement is to test the fault-tolerant 
ability. In order to provide errors, three instruction memories are necessary and more 
signals need to be monitored. 

1. Schematic and Simulation 

Figure 36 is a complete schematic with all of the components for the fault-tolerant 
testing. The concept is to change one of the instructions loaded into the TMRA and see if 
the voters can catch the error, correct it, and report it. Since the inconsistent instruction 
will lead one of the KDLX processors to do something different from the other two, the 
voters should flag the inconsistency and point out the faulty processor, i.e., either cid_1 or 
cid_0 or both should not be zero. Some bits in the error detection bus, err, ought to be 1 
whenever any error exists. If all these signals work properly, the TMRA will be able to catch 
an error and trigger an interrupt routine. 

Three instruction memories, ROM A, ROM B and ROM C, are pre-configured 
with three different instruction maps. The data memory at the right side, RAM, has a 
non-repeated value in each of its memory locations. This makes the data in the simulation 
more easily identified since each memory address holds a unique value. Memory maps for 
the ROMs and RAM are displayed in Table 19. 

Figure 36. Schematic for Fault-Tolerant Testing 

Address    RAM    ROM A      ROM B      ROM C
00         20     000000     000000     000000
01         21     000000     000000     000000
02         22     000000     000000     000000
03         23     44010A     44010A     44010A
04         24     440203*    44020B     44020B
05         25     44030C     440A0C*    44030C
06         26     44040D     44040D     350911*
07         27     000000     000000     000000
08         28     000000     000000     000000
09         29     000000     000000     000000
0A         2A     000000     000000     000000
0B         2B     450106*    450103     450103
0C         2C     450208     450207*    450208
0D         2D     450309     450309     450302*
0E         2E     450410     450410     450410

Table 19. Instruction And Data Memory Maps 


The inconsistent instructions are marked with an asterisk in Table 19. The TMR Assembly 
simulation is shown in Figures 37, 38, and 39. 

Figure 37. Simulation of Fault-Tolerant Testing 

Figure 38. Simulation of Fault-Tolerant Testing (continued) 

Figure 39. Simulation of Fault-Tolerant Testing (continued) 

In Figure 37, when the signal reset_p goes from low to high, the TMRA starts 
fetching instructions. Notice that the signal out_mem shows 20₁₆, which is the value at 
address 00₁₆. The instructions at address 03₁₆ of the ROMs are fetched at point 1. 
Following that, three more instructions are fetched in sequence. The first instruction, 
44010A₁₆, is executed at point 2 in Figure 38 while addr_rom is 05₁₆ and addr_ram is 
0A₁₆. The addr_rom signal contains the address of the instruction being fetched, i.e., 05₁₆. 
The addr_ram signal contains the address that the first instruction, i.e., 44010A₁₆, is using 
to access RAM. In this case, 0A₁₆ is the correct address for this first instruction. 

From this point in the simulation, inconsistencies have been introduced in the in¬ 
struction memory. The bit distribution of the bus needs to be introduced in the next sec¬ 
tion before the simulation analysis is presented. 

2. Bit Distribution 

Recall the schematic in Figure 26. Four signals (i.e., V_ERR, CID_0, CID_1, and 
ERR) are collected into four different buses and each bus is 51 bits wide. Since one 51-bit 
bus consists of outputs from 6 different voters, each voter has a range in the bus 
distribution. By looking at the bits in the distribution, one can tell which signal on which 
processor is wrong. The bit distribution for CID_1, CID_0, and ERR is shown in Figure 40. 


CID_1(50:0), CID_0(50:0) and ERR(50:0) Bit Distribution 

Field    data(15:0)    pc(15:0)    addr_int(15:0)    wr    rd    prog_rd
Bits     50-35         34-19       18-3              2     1     0


Figure 40. Bit Distribution of CID_1, CID_0 and ERR Buses 


In Figure 40, the bit distributions of all three buses are identical. For example, a 1 
at bit 20 of the ERR bus means that one of the KDLX processors has an error in its 
program counter. At the same time, bit 20 of the CID_1 and CID_0 buses will point out the 
faulty processor. 
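
The mapping from a bus bit to a signal can be written down directly from the ranges in Figure 40. The sketch below is illustrative only; the names `FIELDS`, `locate` and `flagged` are hypothetical helpers and not part of the design.

```python
# Field ranges of the 51-bit CID_1, CID_0 and ERR buses, as given in Figure 40.
FIELDS = [
    ("data", 35, 50),
    ("pc", 19, 34),
    ("addr_int", 3, 18),
    ("wr", 2, 2),
    ("rd", 1, 1),
    ("prog_rd", 0, 0),
]

def locate(bus_bit):
    """Return (field name, bit index inside that field) for a bus bit 0..50."""
    for name, low, high in FIELDS:
        if low <= bus_bit <= high:
            return name, bus_bit - low
    raise ValueError("bus bit must be in the range 0..50")

def flagged(bus_value):
    """List every (field, bit) pair whose bit is 1 in a 51-bit bus value."""
    return [locate(b) for b in range(51) if (bus_value >> b) & 1]

print(locate(20))             # bus bit 20 falls inside the program counter field
print(flagged(1 << 20))       # the same lookup, starting from a raw bus value
```

Subtracting the low end of the range gives the bit position inside the signal itself, so bus bit 20 corresponds to bit 1 of the program counter.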

3. Simulation Analysis 

The three instructions fetched by the TMRA at point 1 in Figure 37 are identical so 
no error is reported at point 2. Since there is no error in any one of the processors, the 
cid_1 and cid_0 buses will not identify any processor. It was mentioned that the memory 
needs a half clock cycle to send out data once it receives signals. That is why the first 
data is not on the out_mem bus until point 3. It can be verified that the TMRA is loading 
a correct value. 

When the instructions become inconsistent, the error detection signal is no longer 
zero. Meanwhile, cid_1 and cid_0 locate the faulty processor. This can be checked 
from point 4 to 6. Figure 41 is the bit distribution of the error detection signals for the 
first Opcode, 44010A₁₆. The hexadecimal number in the simulation is translated to a 
binary number when doing this data analysis. 

Figure 41. ERR Analysis for the First Opcode 


It is obvious that bit 6 is inconsistent among the three processors. In order to verify 
the error, the signals cid_1 and cid_0 should be analyzed. Converting the hex numbers in 
the simulation to binary numbers and comparing the bit distribution with Figure 40 
indicates that (Figure 42) the inconsistent bit is on the address bus and Processor A is the 
faulty processor. Recall from Table 17 that cid_1 is the most significant bit, so 01₂ stands 
for the first processor (i.e., Processor A). It is true that the instruction at address 01₁₆ in 
ROM A is the actual location of the error, but since this instruction is only sent to the first 
processor in the TMRA, Processor A is identified as faulty. 

Figure 42. CID_1 and CID_0 Analysis for the First Opcode 



The reason that the error is at bit 6 is because that is the only location where the 
output bits are not consistent in the three processors. Figure 43 shows the situation. 



                    Hex    Binary
Correct Address     0B     0000 1011
Wrong Address       03     0000 0011
                                ^
                                bit 3

Figure 43. Address Comparison for the First Opcode 

The second Opcode in ROM B has an incorrect destination register. Since there 
are no output signals on the KDLX for the destination register, point 4 in Figure 38 reports 
no error, even though this wrong Opcode loads correct data into the wrong register. 
The contents of R3 are now inconsistent between the three processors, as are the contents 
of R10. This kind of error will only be found when the content of the faulty register is 
used. Point 9 in Figure 39 stores the contents of R3 to memory location 09₁₆. It is known 
that the data in R3 is wrong in Processor B, but the Opcode difference at point 9 also 
means that the memory address of Processor C is wrong. Figure 44 shows the simulation 
result for point 13 in Figure 39. Six inconsistent bits were caught. 

Figure 44. ERR Analysis at Point 13 


The contents of R3 in Processor B are zero, but in Processors A and C they are 
2C₁₆. For cid_1 and cid_0, it is expected that the data portion in the bit distribution 
indicates that Processor B is wrong. Figure 45 shows the inconsistent bits between the 
correct and wrong data. 

[Figure 45 labels: bit 3, bit 2] 

Figure 45. Data comparison for R3 


The bit distribution of cid_1 and cid_0 should put 002C₁₆ in the data portion and 
indicate all inconsistencies caused by Processor B. Figure 46 illustrates that it does. 

[Figure 46 label: Processor B] 

Figure 46. CID_1 and CID_0 Data Portion Analysis at Point 13 


In addition, the address differences from Processor C at point 9 should also be 
indicated by cid_1 and cid_0. This is shown in Figure 47. 

                                Hex    Binary
Address of A and B (correct)    09     0000 1001
Address of C (wrong)            02     0000 0010

[Figure 47 labels: inconsistent portion; prog_rd; Hex 0...058] 

Figure 47. CID_1 and CID_0 Address Portion Analysis at Point 13 

Notice that both cid_1 and cid_0 at point 13 have the hex number 58. The 
inconsistent bits of the addresses are reflected correctly in the bit distribution. Processor C is 
identified as the faulty one that gives a different address to the voter than the others. This 
proves that the cid_1, cid_0 and err signals can deal with these kinds of multiple errors and 
still report correctly. 
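
The numbers at point 13 can be reproduced with a small bitwise model of the voter outputs. This is a sketch under assumptions, not the thesis circuit: the function names are hypothetical, and the processor codes B = 10₂ and C = 11₂ are assumed by extension of the one code the text states (01₂ for Processor A, with cid_1 as the high bit).

```python
# Sketch of a bitwise majority voter with error reporting. Only the code for
# Processor A (01) is stated in the text; the codes for B and C are assumed.
CODES = {"A": 0b01, "B": 0b10, "C": 0b11}

def vote(a, b, c):
    """Return (voted, err, cid_1, cid_0) for three equally wide bus values."""
    voted = (a & b) | (a & c) | (b & c)       # bitwise 2-of-3 majority
    err = cid_1 = cid_0 = 0
    for name, value in (("A", a), ("B", b), ("C", c)):
        diff = value ^ voted                  # bits where this unit disagrees
        err |= diff
        if CODES[name] & 0b10:                # high bit of the processor code
            cid_1 |= diff
        if CODES[name] & 0b01:                # low bit of the processor code
            cid_0 |= diff
    return voted, err, cid_1, cid_0

def pack(data, addr):
    """Place data in bits 50..35 and addr_int in bits 18..3 (Figure 40)."""
    return (data << 35) | (addr << 3)

# Point 13: Processor B holds 0 in R3 and Processor C uses address 02.
a, b, c = pack(0x002C, 0x09), pack(0x0000, 0x09), pack(0x002C, 0x02)
voted, err, cid_1, cid_0 = vote(a, b, c)
ADDR_PORTION = 0x7FFFF                        # bus bits 18..0
print(hex(cid_1 & ADDR_PORTION), hex(cid_0 & ADDR_PORTION))
print(bin(err).count("1"))                    # number of inconsistent bits
```

Under these assumptions both CID buses carry 58₁₆ in their low bits and ERR has six bits set, which agrees with the values read from the simulation. The address difference 09₁₆ XOR 02₁₆ = 0B₁₆ becomes 58₁₆ because addr_int starts at bus bit 3.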

Following the same procedure to analyze data on the buses, one should be able to 
understand how the voter works and how to use these signals for an interrupt routine. 
The rest of the simulation also performs correctly. The Opcode at address 06₁₆ of ROM 
C is a disaster since there is no such instruction. Based on the analysis above, 
this kind of error will still be corrected. The inconsistency of register contents will be 
corrected the next time they are used and the wrong addresses will not affect anything as 
long as the other two addresses are correct. Correct data will still be fetched at point 7 in 
Figure 38. The memory output data bus switches back to 0020₁₆ at point 8. Next, three 
store instructions are fetched in series. The first data written to memory shows up at 
point 10. Simple address inconsistencies at points 11 and 12 are easily analyzed. Errors at 
point 14 are detected, even though all three Opcodes, 450410₁₆, are the same. That is 
because the data loaded into R4 earlier was different and the error occurs only when R4 is 
routed to the output. 

F. IMPORTANT SIMULATION CONCEPTS REVIEW 

Simulation results are used extensively in this chapter to explain the operation of the 
TMR. Fundamental ideas on how to construct a test bench and how to analyze results 
have been established. Due to the different properties of the different components, a 
design may not work when additional components are connected. Generating a good test 
bench is not easy since most timing problems are unpredictable. Some important points 
about simulation are reviewed here in order to help shorten development time. 

1. KDLX Was Designed to Work with Asynchronous Memory 

In a personal conversation with Dr. Kenny Clark, I learned that the KDLX was 
designed for an asynchronous memory. Although it will work with a synchronous in¬ 
struction memory, an asynchronous memory is recommended since one should assume 
that the instruction memory and the data memory are in the same physical memory. Al¬ 
ways provide some different time constraints between KDLX and memories when gener¬ 
ating a test bench. 

2. Start with A Simple Test Bench First 

Trying to test everything on a new design is a bad idea. Too many signals need to 
be tracked and multiple errors are hard to debug. It is a good idea to start with a simple 
test bench which only tests a small part of the design. Revise the test bench to become 
more complicated step by step. It is also good to individually test every component gen¬ 
erated before constructing a top-level design. 

3. Test Bench Is Optimized for the Current Design 

As introduced earlier, the simulations have different time constraints. A test 
bench is used to check whether a design works under reasonable assumptions. Circuits 
will be modified many times until the full design is complete. It is hard to specify the 
requirements for a test bench before a circuit is actually built, so it is almost impossible to 
have an ideal test bench for a full design and every single component. In addition, a test 
bench that works on the top-level design may not suit a single component. Timing 
mismatches always change with different wiring. 

4. Keep Old Designs 

It was shown with the TMR Assembly schematic that sometimes an old design is the 
one that actually works. Incorrect settings for a test bench can mislead a designer into 
making a wrong decision, and a modified design can become useless when other components 
are connected. Features of different components will sometimes balance out timing 
mismatches between them. Going over previous designs helps a designer to recover the 
original reasoning, so keeping those files available is important. 

5. Working on the Copy of Source 

Based on personal experience, it is good to add a copy of a tested circuit into a 
large design rather than adding the original. This not only keeps the integrity of the 
original file but also makes it easy to review. Without making a copy, the new design 
will associate with the original design. Any modification in the new design directly af¬ 
fects the original file. Therefore, it will be impossible to keep the original source file. 

Keeping the integrity of each circuit is also important. People always want to see 
and test the fundamental design before they jump into the full design. For example, a 
new designer may want to understand voters before studying the TMR Assembly. 
Putting all correct and incorrect circuits into one project is convenient for a designer, but this 
does not help other people to understand. In addition, having all sources in one project 
reduces the independence of individual tests. 

There is no question that making a copy of a source file increases the 
size of the folder and requires more time to manage individual projects. The big benefit 
is that a designer always has the original designs in hand, and all remaining projects 
are tested and ready to go. A new designer thus has a chance to see the function of a 
voter before sinking into the confusion of the complete TMR Assembly. Since another 
new project will be generated once a project has failed, a design like the TMR Assembly 
may have different versions. The useful version contains only useful schematics and test 
benches. From this point of view, all remaining projects are not only useful but also have 
few or no junk sources inside. 

Since hard drive space nowadays is huge and cheap, working on a copy of a file not 
only gives people a chance to review but also makes all projects look clean and easy to 
understand. 

G. CHAPTER SUMMARY 

This chapter introduced the kernel of the full TMR design, i.e., the TMR 
Assembly. Understanding how voters catch errors and how to analyze simulation results is the 
main point of this chapter. Many explanations of simulation results are provided in order 
to help one understand the spirit of the TMR design. After reading so many simulations, one 
should have a feel for how to use and generate a test bench. A quick review of 
simulation concepts is placed at the end of this chapter, after the reader has studied some 
simulations and before moving on to a more complex design. 

Other components associated with the TMR Assembly like the Reconciler, 
Interrupt and Error Syndrome Storage Device (ESSD) will be explained in following chapters. 
The Reconciler is an interface between KDLX and memory; the Interrupt is the one 
generating ISR; the ESSD is responsible for storing error syndromes whenever an error 
occurs. 

VI. RECONCILER 


Due to the different memory architectures between KDLX and CFTP as described 
in Chapter IV, the Reconciler is used to satisfy the timing requirements on both sides and 
properly route the data. Since KDLX can only access memory via load and store instruc¬ 
tions, the Reconciler only needs to monitor the read and write signals from KDLX and di¬ 
rect the data to the correct destinations. 


In this chapter, no error detection or correction will be discussed since the Recon¬ 
ciler is not responsible for this. The TMR Assembly is responsible for error detection. 
Error correction is done by the Interrupt and the voters in the TMR Assembly. Storing 
the error syndromes is the job of the Error Syndrome Storage Device (ESSD). 

A. CONSTRUCTION AND FUNCTION 


Only one physical memory is available in the CFTP. In order to make this one 
memory act as both the instruction memory and the data memory in each KDLX clock cycle, 
the physical memory has to run at twice the speed of the KDLX. For the same reason, the 
Reconciler also has to run twice as fast as the KDLX. For each KDLX clock cycle, one 
address bus access and one data bus access for instructions need to be available. 
Meanwhile, one address bus access and one data bus access for data also need to be available. 
To fetch an instruction and do a data read or write, the Reconciler has to act as an instruction 
memory in the first half of the KDLX clock cycle and act as a data memory in the second 
half of the KDLX clock cycle. This function is illustrated in Figure 48. 


[Figure 48 labels: KDLX clock; KDLX signals; Memory or Reconciler clock; pc(15:0) is 
available; instr(23:0) is available; addr_int(15:0) is available; data(15:0) is available; 
Instruction fetch; Data read or write] 

Figure 48. Illustration of Reconciler Function 

The Reconciler is composed of a state machine coded in VHDL, which is presented 
completely in Appendix C, section A. The state machine contains five states: one starting 
point, two for normal operations, one for read, and one for write. This function can be 
seen clearly in Figure 49. 


[Figure 49 label: State1] 



The name of the state is on the top of each circle except for the initial state named 
State. The number in each state is the state number designed for tracking purposes in the 
simulation. The two normal operations, State0 and State1, are identical and are for 
fetching instructions. Without reading or writing, these two states just pass the program 
counter to memory, fetch the instruction and send it back to the KDLX. At this time, the 
memory acts as a ROM and its data-input bus is in a high impedance state. Since only 
the instruction bus is used, the data bus of the KDLX is also in a high impedance state. 
State1 is a duplication of State0, so the state machine could be revised to stay at State0 
when neither rd_r nor wr_r is 0. The reason for using two states is to provide tracking in 
simulation. Since the Reconciler runs twice as fast as the KDLX, reading and writing 
actions only occur at State0. Without the separation into two states, it is hard to tell if a 
read or write occurs at the proper state. 

When rd_r is 0 and wr_r is 1, the state machine goes to the ReadState. The KDLX 
wants to read data from memory so the Reconciler will pass a high write signal to the 
memory and direct data from the memory to the KDLX. When rd_r is 1 and wr_r is 0, the 
Reconciler knows that the KDLX wants to write data to the memory, so it passes a low write 
signal to memory and directs data from the KDLX to memory. 

The initial state, State, is not used until the next reset. It is null and there are no 
actions in this state. Without this state, the state machine would use State0 as the initial 
state and start at State1 after reset. 
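
The transitions described above can be summarized in a few lines of code. The actual VHDL is in Appendix C, so this is only a hypothetical model: it assumes that the branch to ReadState or WriteState is taken from State0 and that every other state returns to State0 on the next Reconciler clock. The function names `next_state` and `trace` are illustrative.

```python
# Hypothetical model of the Reconciler state transitions. The transition
# order is an assumption drawn from the prose, not a copy of the VHDL.
def next_state(state, rd_r, wr_r):
    """Advance one Reconciler clock; rd_r and wr_r are active low."""
    if state == "State":                  # initial state, left after reset
        return "State0"
    if state == "State0":                 # instruction fetch half
        if rd_r == 0 and wr_r == 1:
            return "ReadState"
        if rd_r == 1 and wr_r == 0:
            return "WriteState"
        return "State1"
    return "State0"                       # State1, ReadState and WriteState

def trace(signals, state="State"):
    """Apply a list of (rd_r, wr_r) pairs and return the visited states."""
    visited = []
    for rd_r, wr_r in signals:
        state = next_state(state, rd_r, wr_r)
        visited.append(state)
    return visited

print(trace([(1, 1), (1, 1), (1, 1), (0, 1), (1, 1), (1, 0), (1, 1)]))
```

Keeping State1 as a separate state, rather than looping in State0, makes each half of the KDLX clock cycle visible as a distinct state number in the simulation.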

B. SCHEMATIC AND SIMULATION OF RECONCILER ONLY 

Converting VHDL code to a schematic symbol is a useful function in the ISE 
software. The schematic symbol of the Reconciler is shown in Figure 50. 

Figure 50. Schematic Symbol of Reconciler 


Simulation of the Reconciler itself is quite simple. Since it is basically a state 
machine, it will either stay in the current state or jump to a new state every clock cycle. 
Figure 51 is the simulation result. 

Figure 51. Simulation Result of the Reconciler 


The signal at the bottom in Figure 51 is the state number used to track which state 
is active. The state machine starts at State0 after reset. The signal addrout_r is the bus 
connected with the memory address bus. It sends out either pc_r or addrin_r depending 
on whether the system is doing an instruction fetch or a data read/write. In State0 and 
State1, addrout_r always has the same value as pc_r. The memory data output bus 
connects with the signal datain_r on the Reconciler and sends out either an instruction or a 
data value. When rd_r is low, data on datain_r will be forwarded to mem_data, which 
connects to the data bus of the KDLX. When wr_r is low, the state machine goes to the 
WriteState. In this state, data from the KDLX is available on mem_data and the Reconciler 
will direct this data to dataout_r, which connects to the data input bus of the memory. 
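
The routing just described can be expressed as a small function of the current state. This is a simplified sketch: `HIGH_Z` is a stand-in for a high impedance bus, the function name `route` and its argument order are hypothetical, and the instruction bus is left out.

```python
HIGH_Z = None  # stand-in for a high impedance bus in this sketch

def route(state, pc_r, addrin_r, datain_r, kdlx_data):
    """Return (addrout_r, mem_data seen by the KDLX, dataout_r) for one state."""
    if state == "ReadState":
        # Data read: the data address goes to memory, memory data goes to the KDLX.
        return addrin_r, datain_r, HIGH_Z
    if state == "WriteState":
        # Data write: the data address goes to memory, KDLX data goes to memory.
        return addrin_r, HIGH_Z, kdlx_data
    # State0 and State1: instruction fetch, so the program counter is the address.
    return pc_r, HIGH_Z, HIGH_Z

print(route("State0", 0x0005, 0x0040, 0x440443, HIGH_Z))
print(route("ReadState", 0x0005, 0x0040, 0x1234, HIGH_Z))
print(route("WriteState", 0x0005, 0x0040, HIGH_Z, 0xBEEF))
```

The data values 1234₁₆ and BEEF₁₆ are made up for the example; only the choice of address source and data direction reflects the text.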

The instr_data bus is never in a high impedance state regardless of whether the data 
on datain_r is an instruction or not. The reason is to make an instruction stay available 
until the next KDLX clock cycle. Even during ReadState and WriteState, the next 
instruction for the KDLX is held on the instruction bus. Remember that the Reconciler is 
twice as fast as the KDLX. If the next instruction is only available for the first half of the 
KDLX clock cycle, it will not be fetched at the rising edge of the next KDLX clock. This 
concept will be described again when the Reconciler is hooked up with a KDLX 
processor. 

C. SCHEMATIC AND SIMULATION OF RECONCILER WITH KDLX 

The last step for testing the Reconciler is to simulate it with a KDLX. The 
schematic of this part of the design is shown in Figure 52. 

Figure 52. Schematic of Reconciler with KDLX and Memory 


The memory offered in the ISE software is not a real Von Neumann architecture. 
Instead of having one bi-directional data bus, the Reconciler is designed to have two 
separate buses for data, datain_r(23:0) and dataout_r(23:0). The mem_data(15:0) bus on the 
Reconciler is bi-directional in order to transfer data back and forth with the KDLX. 

The simulation for this circuit is done with a series of load and store instructions 
in order to see if the Reconciler can handle both instructions and data correctly. Figure 
53 is the first part of the simulation result. 



Figure 53. The First Part of the Simulation Result for Reconciler 

In Figure 53, the first instruction in memory is fetched at point 1 when pc_p is 
sent. It can be seen clearly from the status of state_r that the Reconciler runs at double 
speed. At point 2, the Opcode 440140₁₆ is executed and wants to load data into R1. At 
the same time, the KDLX is going to fetch the Opcode 440443₁₆. The address of data for 
the first instruction is available at point 3 in this time interval. Therefore, the signal 
addr_in takes pc_p in the first half of the KDLX clock cycle and takes addr_p in the 
second half of the KDLX clock cycle. The data at memory location 0040₁₆ is thus sent 
from memory to the KDLX when state_r is 2. Notice that in this time interval Opcode 
440443₁₆ is available on the bus until the next KDLX clock. This is important since the 
KDLX is triggered at the rising edge of the clock. Failure to keep an instruction until the 
next rising edge will mean that the KDLX will not be able to fetch this instruction and the 
memory location for data will not appear at point 4. This is why the instruction bus is not 
set to a high impedance state in the ReadState and WriteState in the Reconciler. The rest 
of this simulation, in Appendix A, section H, does a series of writes followed by a series of 
reads in order to check whether the Reconciler functions properly. 

D. TIMING CONCERNS 

An added complexity for this simulation is the fact that it has three different 
clocks. To make this simulation work, the time constraints of the test bench have to be 
set properly. The sequence of execution in this circuit is that the KDLX sends its 
program counter to the Reconciler first. Then the Reconciler forwards this address to the 
memory. Next, the memory selects the instruction and sends it to the Reconciler. Finally, 
the Reconciler forwards this instruction to the KDLX. This is a simple example of how the 
KDLX fetches an instruction. 


In order to successfully fetch an instruction, the KDLX has to have its program 
counter ready before the Reconciler needs it. The Reconciler has to have the address set 
before the memory is ready to receive it. Considering setup time and hold time for each 
clock, the relationship among these three clocks is shown in Figure 54. 


[Figure 54 labels: KDLX Clock; KDLX: Setup Time, Hold Time; Reconciler: Setup Time; 
Memory: Setup Time, Hold Time] 

Figure 54. Timing Relationship Among Clocks 


It does not matter that the Reconciler and memory clocks are faster than the KDLX clock 
since the KDLX has to be ready whenever the Reconciler needs data. In Figure 54, all 
three clocks are shown together as they were in the simulation for comparing timing 
requirements. Since the Reconciler has a hold time longer than the KDLX, the KDLX will be 
ready before the Reconciler is ready. The Reconciler will be set before the memory 
needs input signals. 

When the KDLX is executing a read-data instruction, the memory will have the data 
available after the KDLX starts to read. Therefore, a little clipping occurs every 
time that the KDLX reads data. To minimize this clipping, the setup and hold times of 
the three clocks have to be as close as possible. 

In this simulation, if any two clocks have identical setup and hold time, the testing 
will fail. Since the Reconciler is a state machine, the current state will jump to a different 
state if the conditional requirements are not met in time. This causes the KDLX to fail to 
interact with the memory; therefore the following instructions will not be fetched. 
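
A test bench setting can be checked against the rule that no two clocks share identical setup and hold times before a simulation is run. The sketch below is illustrative: the function name is hypothetical and the timing numbers are made up for the example, not taken from the thesis test bench.

```python
from itertools import combinations

def identical_timing_pairs(constraints):
    """Return pairs of clock names whose (setup, hold) values are identical."""
    return [
        (a, b)
        for (a, ta), (b, tb) in combinations(constraints.items(), 2)
        if ta == tb
    ]

# Illustrative (setup, hold) values in ns; they are not the thesis settings.
bench = {"KDLX": (10, 10), "Reconciler": (8, 12), "Memory": (5, 5)}
clashes = identical_timing_pairs(bench)
print(clashes if clashes else "no two clocks share identical setup and hold times")
```

Any pair reported by the check marks a setting that, according to the observation above, would cause the state machine to miss its conditions.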

E. CHAPTER SUMMARY 

This chapter introduced the function of the Reconciler in the TMR design. This 
component is designed to consolidate two different architectures in a circuit and is not di¬ 
rectly associated with error detection or correction in the TMR. This is the first time in 
this thesis that time constraints were discussed in detail since there are specific timing re¬ 
quirements for the Reconciler. The concept of establishing the setup time and hold time 
for a test bench is more important after this chapter because more components are in¬ 
volved in the TMR design. 

Another component, called the Interrupt, is discussed in the next chapter. This 
component leads the TMR design to the Interrupt Service Routine (ISR) when an error 
occurs. How to intercept the current execution of the KDLX to start an ISR and how it 
works with other components in the TMR design will be described as well. 





VII. INTERRUPT 


The TMR Assembly, consisting of processors and voters, is able to detect an error 
and correct it. Even though voters are able to correct errors as they come out the system, 
whichever of the KDLX processors that caused the error will still have the wrong data in¬ 
side. If an error in one processor is not corrected in time, another error occurring in an¬ 
other processor may not be detected by voters. As was described earlier in Chapter V, a 
majority voter is not able to handle multiple identical errors. 

In order to correct an error in the KDLX, the normal operation has to be stopped 
and all contents of registers in the three processors have to be voted. The voters 
correct any inconsistency between the three processors in this process while all 
correct data is stored into memory and then reloaded into the original registers. Once this 
procedure is done, all contents of registers are identical between the three processors. 
The Interrupt is the circuit used to stop normal operation and switch the circuit to do this 
error correction. 
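
The voting step described above can be illustrated with a small behavioral sketch in Python. This is not the thesis's VHDL; the function names `majority` and `scrub_registers` are hypothetical, and the sketch assumes a plain bitwise two-out-of-three vote on each register.

```python
def majority(a, b, c):
    """Bitwise two-out-of-three vote: each output bit follows at least two inputs."""
    return (a & b) | (a & c) | (b & c)


def scrub_registers(regs_a, regs_b, regs_c):
    """Vote every register across the three processors and reload the voted value.

    Returns three identical register files, mirroring the store-then-reload
    procedure of the ISR (hypothetical helper, for illustration only).
    """
    voted = [majority(a, b, c) for a, b, c in zip(regs_a, regs_b, regs_c)]
    return list(voted), list(voted), list(voted)


# One register of the second processor holds a wrong value.
a, b, c = scrub_registers([0x00AA, 0x00CC], [0x00AA, 0x0011], [0x00AA, 0x00CC])
```

After the call, all three register files hold the same voted contents, which is the condition the ISR is meant to restore.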

A. CONSTRUCTION AND FUNCTION 

The Interrupt is also a state machine coded in VHDL. The state machine is 
shown in Figure 55. The concept is to have it look for the error detection signal from the 
TMR Assembly. If an error occurs, it latches the current program counter and sends 
a TRAP instruction to the processors. Two NOPs follow the TRAP instruction in order to 
clean the pipeline of the processors. Only two NOPs are needed because the TRAP 
instruction starts to be executed right after the second NOP. Any instruction after the 
second NOP would either be useless or mask out instructions that the TRAP wants to fetch. 
After the second NOP, the TMR Assembly is in the ISR and the Interrupt waits for an 
RFE instruction from memory, placed to mark the end of the ISR. 

When the processors receive the TRAP instruction sent from the Interrupt, they jump 
to a specific memory location and start the ISR for storing and reloading the contents of 
all of the registers. The last instruction in the ISR is the RFE instruction. When memory 
sends out this instruction, it is seen by the Interrupt, and the Interrupt replaces 
the RFE instruction with a new Jump instruction. This new Jump instruction is 
constructed by the Interrupt from the Opcode C8₁₆ plus the latched program counter to force 
the processors to jump back to where the trap occurred. 
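
The sequence just described (latch the program counter, send TRAP, send two NOPs, wait for RFE, replace RFE with a Jump) can be sketched as a small single-speed state machine in Python. This is only a behavioral illustration under stated assumptions, not the VHDL in Appendix C: the class and method names are hypothetical, the NOP encoding 000000₁₆ is taken from the memory tables, and the Jump word is assumed to be the 8-bit opcode C8₁₆ followed by the 16-bit latched program counter.

```python
from enum import Enum, auto

TRAP = 0x280030      # TRAP word quoted in the text (ISR starts at 0030 hex)
NOP = 0x000000       # assumed NOP encoding, as it appears in the memory tables
RFE = 0xF80000       # RFE word quoted in the text
JUMP_OPCODE = 0xC8   # opcode used to build the replacement Jump


class State(Enum):
    STATE0 = auto()  # normal operation, watching err
    TRAP = auto()
    NOP0 = auto()
    NOP1 = auto()
    WAIT = auto()    # ISR running, watching for RFE
    BACK = auto()    # Jump injected in place of RFE


class Interrupt:
    def __init__(self):
        self.state = State.STATE0
        self.latched_pc = 0

    def step(self, err, pc, mem_instr):
        """Advance one clock and return the word forwarded to the processors."""
        s = self.state
        if s is State.STATE0:
            if err:
                self.latched_pc = pc & 0xFFFF
                s = State.TRAP
        elif s is State.TRAP:
            s = State.NOP0
        elif s is State.NOP0:
            s = State.NOP1
        elif s is State.NOP1:
            s = State.WAIT
        elif s is State.WAIT:
            if mem_instr == RFE:
                s = State.BACK
        else:  # BACK
            s = State.STATE0
        self.state = s

        if s is State.TRAP:
            return TRAP
        if s in (State.NOP0, State.NOP1):
            return NOP
        if s is State.BACK:
            return (JUMP_OPCODE << 16) | self.latched_pc
        return mem_instr  # mux passes the memory word through
```

Stepping the model with an error raised while the program counter is 0008₁₆ yields TRAP, NOP, NOP, then the ISR instructions from memory, and finally C80008₁₆ in place of the RFE word.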


[State diagram; legible transition labels: err = 1, no RFE] 

Figure 55. State Machine of Interrupt 

Recall the function of the TRAP and RFE instructions in Table 13. The RFE instruction 
is replaced with a Jump instruction because the RFE instruction does not 
jump back to where the TRAP instruction occurs. The RFE jumps to 
the address stored in the IAR, which is two clock cycles later than when the TRAP 
occurred. The choice was between revising a tested version of the KDLX and building a 
separate circuit able to generate a new Jump instruction. The separate circuit is easier to 
achieve for this Interrupt since it is a state machine and is coded in VHDL. First, a state 
machine can do several different things in one clock cycle. Because the new Jump 
instruction is not needed until the BackState, the two NOP clock cycles are sufficient for 
generating an instruction. Second, data on different buses can be combined more easily in 
VHDL than with other methods, e.g., schematics. 

The Reconciler discussed in the previous chapter only allows an instruction to be 
fetched in the first half of the KDLX clock cycle, but the state machine shown in Figure 
55 runs at the same speed as the KDLX. In order to interrupt and insert instructions with 
the correct timing, the Interrupt has to match the speed of the Reconciler. Doubling the 
speed of the Interrupt is not the same as doubling that of the Reconciler since the Interrupt has 
several different states in series. The methodology here is to duplicate each state, which 
makes the state machine twice as long. The new state machine is shown in Figure 56 and 
its VHDL code is in Appendix C, section B. 


[State diagram; legible labels: reset_i = 0, NopState0_A, NopState0_B, NopState1_A] 

Figure 56. New State Machine of Interrupt 


The first two states, State0_A and State0_B, do not need to be duplicated in spite 
of the even number of states. The state machine is also revised so that only State0_B can 
go to TrapState_A. In spite of the double speed, State0_A still needs to go to State0_B even 
if an error occurs at State0_A. On the other hand, the KDLX reads and writes data at the 
falling edge of the clock, which means that a data error always occurs at State0_B. After 
NopState1_B, the TMR design starts the ISR and the WaitState_B waits for the RFE 
instruction. Once the RFE instruction is sent out from memory, the Interrupt takes over the 
instruction bus again and injects the new Jump instruction at the BackState_A. The TMR 
design goes back to normal operation when the new Jump instruction is executed by the 
processors. 

B. SCHEMATIC 

The functions of the Interrupt can be easily understood from the simulation result 
shown in Appendix A, section I. The simulation of the Interrupt alone is not explained 
here since the signal state_i explicitly indicates the active states in Figure 56. Figure 57 is the 
schematic symbol of the Interrupt. 


[Schematic symbol "interrupt"; ports: clk_i, reset_i, err, pc_in(15:0), rfe_i(23:0), 
pc_out(15:0), sel_i(23:0), trap_i(23:0), state_i(3:0)] 

Figure 57. Schematic Symbol of Interrupt 


The input signal err is used to monitor the occurrence of an error. When this 
signal goes high, the ISR starts. Once the ISR is triggered, the program counter where the 
error occurs is sent to pc_in(15:0), where it is latched, and this latched program 
counter is output instantly at pc_out(15:0). The Interrupt uses signal sel_i(23:0) to 
switch a mux and sends out the TRAP instruction via trap_i(23:0). After that, sel_i(23:0) 
switches the mux back to normal and the input signal rfe_i(23:0) starts monitoring the 
Opcodes passing through on the instruction bus. When the RFE instruction is sent out 
from memory, sel_i(23:0) activates again and trap_i(23:0) sends out the new Jump 
instruction. Consequently, the TMR design is back to its normal operation. Figure 58 is the 
design of the Interrupt with a processor and two memories. 






Figure 58. Schematic of the Interrupt with KDLX and Memories 

The mux located between the instruction memory and the KDLX is used by the Interrupt to 
inject the TRAP instruction. Normally, the KDLX fetches instructions from the 
instruction memory and the mux allows this traffic to pass. When an error occurs, the mux 
controlled by the Interrupt immediately switches to the other bus and a TRAP instruction 
generated by the Interrupt is sent to the KDLX. The original instruction at this time is 
blocked on the bus and the KDLX receives the TRAP instruction instead. The Opcode 
for the TRAP instruction in this thesis is 280030₁₆, which uses memory location 0030₁₆ as 
the starting point of the ISR. This value can be easily changed in the Interrupt's VHDL 
code. The basic idea is to keep the ISR address far enough from the addresses used by 
normal operations in memory that the ISR is not overwritten. Simulations in this thesis are 
carefully designed, and small address spaces let the reader see the complete implementation in 
memories. 
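
A minimal Python sketch of the instruction mux and of the TRAP word layout follows. It rests on assumptions that are not confirmed by the schematic: the 24-bit sel_i is treated as a bitwise mask in which all ones passes the memory word and all zeros selects the injected word, and an instruction word is assumed to be an 8-bit opcode followed by a 16-bit address. The function names are hypothetical.

```python
WORD_MASK = 0xFFFFFF  # instruction words are 24 bits wide


def instruction_mux(sel, mem_instr, injected):
    """Bitwise 2-to-1 mux (assumed polarity): sel=FFFFFF passes memory, sel=000000 injects."""
    return (mem_instr & sel) | (injected & ~sel & WORD_MASK)


def split_word(word):
    """Split a 24-bit word into (8-bit opcode, 16-bit address), an assumed layout."""
    return (word >> 16) & 0xFF, word & 0xFFFF


normal = instruction_mux(0xFFFFFF, 0x440606, 0x280030)    # memory word passes
trapped = instruction_mux(0x000000, 0x440606, 0x280030)   # TRAP replaces it
opcode, isr_start = split_word(trapped)
```

Under these assumptions the TRAP word 280030₁₆ splits into opcode 28₁₆ and ISR start address 0030₁₆, so moving the ISR only requires changing the low 16 bits of the injected word.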

C. SIMULATION 

Table 20 shows the contents of the memories and the registers before and after the 
simulation. 


Data Mem 

Address   Contents 
00        (blank) 
01        0044 
02        0045 
03        0046 
04        0047 
05        0048 
06        0049 
07        004A 
08        004B 
09        004C 
0A-0F     (blank) 
10        0044 
11        0045 
12        0046 
13        0047 
14        0048 
15        0049 
16        004A 
17        004B 
18        004C 
19        (blank) 

Register 

Number    Contents 
00        (blank) 
01        0044 
02        0045 
03        0046 
04        0047 
05        0048 
06        0049 
07        004A 
08        004B 
09        004C 
10        0055 
11        0066 
12        0077 
13-15     (blank) 

Instruction Mem 

Address   Contents     Address   Contents 
00        (blank)      2D        (blank) 
01        (blank)      2E        (blank) 
02        440101       2F        (blank) 
03        440202       30        000000 
04        440303       31        000000 
05        440404       32        000000 
06        440505       33        450420 
07        440606       34        450520 
08        440707       35        450620 
09        440808       36        450720 
0A        440909       37        411A11 
0B        450110       38        411B22 
0C        450211       39        411C33 
0D        450312       3A        000000 
0E        450413       3B        000000 
0F        450514       3C        000000 
10        450615       3D        F80000 
11        450716       3E        000000 
12        450817       3F        000000 
13        450918       40        000000 
14        450A19       41        (blank) 
15        450B1A       42        (blank) 
16        450C1B       43        (blank) 
                       44        (blank) 
                       45        (blank) 
2C        (blank)      46        (blank) 

Table 20. Tables of Registers and Memories in Simulation 

Part of the complete simulation is shown in Figures 59 and 60. An error is seen at 
point 1 and the instruction at point 2 is intercepted by the Interrupt. It can be seen clearly 
that the value of signal sel_i changes and a TRAP instruction followed by two NOPs is 
injected at point 3. 



One important point is that the time an error is seen is not the time the error 
occurs. This is because the KDLX is pipelined and the memory stage is the fourth 
pipeline stage. Including the time for the Interrupt to respond, the total delay from the 
instruction causing the error is four KDLX clock cycles. This feature cannot be seen in 
this simulation because the error was set manually. 

The program counter latched by the Interrupt at point 3 is 0008₁₆ in this 
simulation. The instruction intercepted is 440606₁₆, which is at address 07₁₆ in Table 20. The 
concept is to jump back to where the TRAP was inserted, so in theory the latched program 
counter should be 0007₁₆, not 0008₁₆. Because of the change of pc_p at point 
3 and the instruction delay from memory, the latched program counter has the wrong value. 
Another possible reason is that, since this error is generated by the test bench and not by the 
circuit itself, the error may be raised at the wrong time. This 
issue will be discussed again and resolved in Chapter VIII when the full design without 
ESSD is presented. 

The TRAP instruction inserted at point 3 affects the circuit at point 5. Opcodes 
from instruction memory address 30₁₆ to 40₁₆ are the ISR. Instructions in the ISR can be 
related or unrelated to the original commands, but the purpose is to correct the error. 
Since there is no actual error in this simulation, the ISR is designed just to do something 
else. The full function of the real ISR is to store all contents of the registers to memory 
and reload these contents back to registers. The ISR in this simulation is incomplete. 



Figure 60. Partial Simulation Result of Interrupt with KDLX (continued) 


The ISR stores the contents of R4 to R7, and the simulation shows that R6 and R7 at point 6 
have not been loaded with any value. This proves that the Interrupt can successfully insert the 
TRAP instruction. At point 7, the RFE instruction (i.e., F80000₁₆) is detected by the 
Interrupt. Instantly, sel_i switches to zero and trap_i sends out the new Jump instruction, 
C80005₁₆. As described earlier, the new Jump instruction is formed from (C8₁₆ + latched 
program counter). Therefore, the Opcode C80005₁₆ is generated and executed at point 8. 
The rest of the simulation in Appendix A, section J checks the contents of registers to verify 
the operation. 

D. CHAPTER SUMMARY 

The functions of the Interrupt were described and simulated in this chapter. 
When an error occurs, the Interrupt should lead the TMR design to do error correction 
and also be able to bring the circuit back to its normal operation. The purpose is to 
correct an error as soon as possible after it occurs, so that the error does not propagate 
and cause the circuit to lose control. 

The first design of the Interrupt was to replace instructions in memory in order to 
implement the ISR. This could not be done in this design because a ROM is used as the 
instruction memory. Since the real CFTP design uses only one RAM, the instruction set 
could be changed in memory. However, changing original instructions is the last thing 
people want to do because it may cause an unrecoverable error. 

In the next chapter, the full design without ESSD will be introduced. The usage 
of the ISR will be described clearly and the interactions between Interrupt and Reconciler 
will be expressed as well. The simulation of the full design should clarify any 
confusions among the different components. 

VIII. THE FULL DESIGN WITHOUT ESSD 


The full design in this chapter consolidates the TMRA from Chapter V, the 
Reconciler from Chapter VI and the Interrupt from Chapter VII. The TMRA contains three 
KDLX processors and six voters. All outputs of the processors are voted and any error 
will be corrected. The Reconciler is responsible for integrating the Harvard and Von 
Neumann architectures. It runs at double speed in order to act as an instruction memory 
in the first half of the KDLX clock and as a data memory in the second half of the KDLX 
clock. The component used to correct errors besides the voters is the Interrupt. It intercepts 
normal operation of the TMRA when an error occurs, forces it to do an ISR and makes it 
jump back to normal operation after the error is corrected. The error signal for the 
Interrupt is given by the TMRA. For this design the voter is assumed to be error-free and the 
voter error detection signal is not used. 

Each component discussed earlier has been simulated to prove its function with or 
without the KDLX and memories. Simulating all these components together in a circuit 
should be able to catch and correct an error. This is the goal for the full design and its 
function will be proved in this chapter. 

A. SCHEMATIC 

The TMRA itself basically connects with the memories as just one KDLX would. 
Most input and output buses are the same except that the number of signals increases or 
decreases. The Reconciler sitting between the TMRA and the memory has to receive all 
output signals that the original KDLX has except the program read signal; that is, it 
receives the read and write signals, the program counter, the address for data, and the data 
bus. The Interrupt needs the error signal to trigger the ISR, the program counter to generate 
a new Jump instruction, and instructions for doing TRAP, RFE and Jump. 

In order to test the circuit, several buses and memory have to be triplicated. The 
way to test the error handling of the system is to program an inconsistency into one of the 
three memories and expect that the circuit can catch the error and correct it. Without this 
artifice, the Interrupt will never work and the ISR will never be triggered. The alternate 

Figure 61. The Full Design 

In Figure 61, only the Interrupt is unchanged since it does not have any data bus 
connections. Three RAMs are used, and a bus connects each to one of the processors. 
Therefore, both the Reconciler and the TMRA have more buses than before. The three muxes at 
the bottom left are used to insert the TRAP and Jump instructions. The box at the top 
left (called or51to1) is coded in VHDL and ORs the 51 bits of ERR(50:0) into 1 bit. Any 
error that occurs at any output signal of the KDLX will trigger the ISR. The revised 
VHDL code for Reconciler is in Appendix C, section C. 
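
The OR-reduction performed by this box can be sketched in Python as follows. The function name mirrors the block name only for readability; treating the 51-bit bus as a Python integer is an assumption of this illustration, not a description of the VHDL.

```python
ERR_WIDTH = 51  # width of ERR(50:0)


def or51to1(err_bus):
    """Return 1 if any of the 51 error bits is set, otherwise 0."""
    if not 0 <= err_bus < (1 << ERR_WIDTH):
        raise ValueError("err_bus must fit in 51 bits")
    return 1 if err_bus != 0 else 0
```

A single mismatching bit anywhere on the voted outputs is therefore enough to raise the one-bit error signal that triggers the ISR.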

Because the Interrupt must monitor a memory bus in order to detect the RFE for 
testing, one of the memories must always be correct. This design chooses RAM A as the 
monitored RAM; therefore its contents are always correct. 

B. SIMULATION 

The three RAMs are pre-configured as shown in Figure 62. In order to express the 
concept of the TMR and keep the simulation simple, only the data at memory location 
4C₁₆ is different for RAM B. The ISR is designed to start at address 30₁₆ and end at 3C₁₆. 
What the ISR does is to store contents of registers to memory, relying on the voters to 
ensure that the correct contents are written into memory. (In the real circuit, the ISR then 
restores all registers from these correct values in memory.) The Opcode F80000₁₆ is the 
RFE instruction used to tell the Interrupt where the end of the ISR is. Instructions from 
address 0A₁₆ to 10₁₆ are used to check data in registers. 

RAM A, B and C 

Address   Contents     Address   Contents 
00        000000       2D        (blank) 
01        000000       2E        (blank) 
02        44014A       2F        (blank) 
03        44024B       30        45014A 
04        44034C       31        45024B 
05        44044D       32        45034C 
06        44054E       33        45044D 
07        44064E       34        45054E 
08        000000       35        45064F 
09        000000       36        000000 
0A        000000       37        000000 
0B        44014A       38        000000 
0C        44024B       39        F80000 
0D        44034C       3A        000000 
0E        44044D       3B        000000 
0F        44054E       3C        000000 
10        44064E       3D        (blank) 
11        000000       3E        (blank) 
12        000000 
13        000000 
14        000000 
                       4A        0000AA 
                       4B        0000BB 
                       4C        0000CC   (RAM B has 000011) 
                       4D        0000DD 
                       4E        0000EE 
                       4F        0000FF 
2C        (blank)      50        (blank) 

Figure 62. Memory Pre-configurations 


Figures 63, 65, and 66 display the full simulation result; some trivial signals 
are not shown. There are four clocks in this design. Clock signals clk_p, clk_i, clk_r, and 
clk_m are for the KDLXs, Interrupt, Reconciler, and RAMs, respectively. The KDLX 
clock runs at one-half the speed of the others. Since the Interrupt does not need signals 
from the Reconciler and vice versa, these two components run at the same clock 
speed. The RAMs are looking for the outputs of the Reconciler so the memory clock has 
the longest setup and hold time. 

The KDLXs, Interrupt and Reconciler are reset at point 1; only reset_p for the 
processors is shown. When the program counter, pc_p, is 0002₁₆, the first instruction is 
fetched. It is known that the instruction at point 2 should cause an error because the data 
at address 4C₁₆ is not consistent between the RAMs. Tracing the simulation to point 3, the 
function of the Reconciler is shown clearly. Half of the KDLX clock cycle is spent 
fetching the instruction at the corresponding program counter and the other half cycle is spent 
reading data from the memory for the first instruction. So the Reconciler actually reads the 
instruction at memory address 0005₁₆ first and then reads the data at address 004A₁₆. 
This feature makes it possible to consolidate the two different architectures. As discussed 
earlier, the instructions should be held until the next rising edge of the KDLX clock. 
Thus the Reconciler should not block any data or make a bus high impedance on instr_ra, 
instr_rb, and instr_rc. 
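
The two half-cycle accesses can be illustrated with a short Python sketch of one KDLX clock cycle against a single unified memory. This is a simplified behavioral model with hypothetical names (`kdlx_cycle`, `ram`); it ignores timing, bus holding and the three-way replication, and only shows the order of accesses.

```python
def kdlx_cycle(memory, pc, data_addr=None):
    """Model one KDLX clock cycle as two accesses to one memory at double speed.

    First half: instruction fetch at pc. Second half: optional data read.
    Returns (instruction, data, ordered list of accesses).
    """
    accesses = [("instr", pc)]
    instr = memory.get(pc, 0)
    data = None
    if data_addr is not None:
        accesses.append(("data", data_addr))
        data = memory.get(data_addr, 0)
    return instr, data, accesses


# Values taken from Figure 62: instruction at 05 (hex) and data at 4A (hex).
ram = {0x05: 0x44044D, 0x4A: 0x0000AA}
instr, data, order = kdlx_cycle(ram, 0x05, 0x4A)
```

The returned access order, instruction first and data second, corresponds to the pair of reads observed at point 3.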

Instructions at point 2 are executed one KDLX clock cycle after point 3. The data 
needed for these instructions is offered at point 4. The wrong data in RAM B is sent to R3 
of the second KDLX in the TMRA at this time. It is hard to see, but cid_0 and cid_1 at 
point 5 do report errors. The main purpose of this simulation is to show how the different 
components work together and realize the concept of the TMR. Therefore, the error 
reports will be analyzed later. 

Since the voters are connected to the output buses of the KDLXs, it may be 
confusing that the TMRA reports an error while it is loading data, not storing. If this error were 
not seen while loading, the TMR would not be able to find it until the next time the 
erroneous value is stored into memory. Figure 64 is only a part of the TMR Assembly in Figure 26 
and shows how input data flows. 





Figure 64. Flowing Direction of the Input Data in TMRA 

The flowing direction of the input data to the KDLXs is expressed clearly in 
Figure 64. Even though the buses on the voters are not bi-directional, the input data can still 
be voted by this scheme. Therefore, the TMR can check data either on loading or storing 
without waiting until the wrong data is used. 

Going back to point 4 in the simulation result, an error is caught by the voter, so 
err_i becomes high and triggers the ISR. At point 6, the signal sel_i switches to 
000000₁₆, which allows the Interrupt to insert one TRAP instruction and two NOPs to the 
TMRA. Notice that state_i changes to 2₁₆, which is the TrapState of the Interrupt. The 
program counter latched is 0008₁₆, so the TMR should jump back to this address when the 
ISR is done. At point 7, the TRAP instruction is executed by the KDLX and starts the 
ISR portion in Figure 62. 


[Waveform figure. Signals shown: clk_p, clk_i, clk_r, clk_m, reset_p, pc_p, addr_p, 
cid_1, cid_0, data_p, data_ra, data_rb, data_rc, err_i, instr_ra, instr_rb, instr_rc, 
trap_i, write_p, state_i, state_r] 

Figure 65. Simulation of the Full Design without ESSD (continued) 


The implementation in this ISR is to store all contents of registers to memory. All 
data in registers will be voted this time and any inconsistency should vanish. The wrong 
data in RAM B ought to be corrected after this implementation. Normally the ISR will 
not write to the original data. It does so here because this test is meant to prove 
the ability to correct an error. Thus the same error should not appear the next time the 
same instruction is executed. 

The contents of R3 show up again at point 8 in the ISR. Any error detected 
while in the ISR is ignored since this procedure is correcting an error and the voters will 
take care of other errors. The err_i flag at point 8 is likewise ignored because it is 
known that the data in R3 of the second processor is wrong. Signals cid_0 and cid_1 at 
this point report the same error syndrome as the one at point 5. This is easily 
explained since the data is the only thing having a problem. If the third Opcode of the ISR were different 
in one of the processors, signals cid_0 and cid_1 at point 8 would have a different error 
syndrome. It can be seen that the Interrupt stays at the WaitState until it sees the RFE 
instruction. 

Once the Interrupt detects the RFE instruction sent out from the RAM A, it starts 
its BackState at point 10. The instruction buses of the Reconciler (i.e., instr_ra, instr_rb 
and instr_rc ) are forced to zero at point 9 when the RFE instruction is detected. The RFE 
instruction can never be passed to the TMRA or it will be fetched and executed at point 
12. If so, the new Jump instruction at point 10 becomes useless. 

The Interrupt inserts the new Jump instruction, C80008₁₆, one clock after point 9. 
Therefore, it takes three clock cycles to have the new program counter used after 
F80000₁₆ is seen by the Interrupt. The operation codes from address 3A₁₆ to 3C₁₆ in Figure 62 
will not be executed since the Reconciler wants to clean the pipeline before the TMR 
goes back to normal operation. So point 11 in the simulation is where the ISR stops. At 
this time, both the Reconciler and the Interrupt are already back to their normal states. The TMR 
goes back to normal operation at point 12. 

Executing exactly the same instruction set again from address 08₁₆ to 10₁₆ in Figure 
62 proves the error in RAM B has been corrected. No error is reported and the ISR is not 
triggered again at point 13 in Figure 66. 

A complete ISR should store all contents of registers to memory and reload them 
back to the original registers. Inconsistent data between the three processors should 
vanish. The ISR shown in Figure 62 is not complete in order to keep the simulation simple. 
Generally speaking, the ISR should not overwrite the original data. A temporary memory 
location needs to be specified for storing and reloading purposes in the ISR. The 
simulation in this design of overwriting the original data just proves the function of the error 
correction. 

Figure 66. Simulation of the Full Design without ESSD (continued) 


C. ERROR ANALYSIS 

The analysis of the error in this simulation is quite easy since the data portion is 
the only part that needs to be checked. Figure 67 shows the way to check the error. 

At point 5 in the simulation, cid_1 is 006E800000000₁₆ and cid_0 is all 
zero. A zoom-in on point 5 is shown in Appendix A, section K. It can be quickly 
identified as an error from the second processor. Comparing the inconsistent portion of the 
data with the cid data shows that they have the same pattern, which demonstrates that the 
error report in this design is correct. 
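
The comparison can be checked numerically with a short Python sketch. Two points here are inferences rather than statements from the thesis: the error pattern is assumed to be the bitwise XOR of the correct and wrong data (which matches the values in Figure 67), and the position of data(15:0) within the 51-bit cid bus, bits 50 down to 35, is inferred only from the single cid_1 value quoted above. The names are hypothetical.

```python
CORRECT = 0x00CC          # data at address 4C (hex) in RAM A and RAM C
WRONG = 0x0011            # data at address 4C (hex) in RAM B
CID_1 = 0x006E800000000   # cid_1 value reported at point 5
DATA_SHIFT = 35           # inferred: data(15:0) sits in bits 50..35 of the cid bus


def mismatch(a, b):
    """Bits that differ between two data words."""
    return a ^ b


def data_field(cid):
    """Extract the assumed data(15:0) portion of a 51-bit cid value."""
    return (cid >> DATA_SHIFT) & 0xFFFF


pattern = mismatch(CORRECT, WRONG)
reported = data_field(CID_1)
```

Both `pattern` and `reported` evaluate to 00DD₁₆ (binary 0000 0000 1101 1101), the inconsistent portion identified in the error analysis.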


                Hex     Binary 
Correct Data    00CC    0000 0000 1100 1100 
Wrong Data      0011    0000 0000 0001 0001 
Error Report            0000 0000 1101 1101 

[Figure annotations: inconsistent data(15:0) portion; inconsistent portion, the Second Processor] 

Figure 67. Error Analysis for the Full Design 


D. CHAPTER SUMMARY 

It is exciting to see that this full design works in simulation. The three KDLX 
processors work in parallel and the design functions as desired. Confusion on how the 
Interrupt or the Reconciler works should have been cleared up by the material in this chapter. 
The program counter is not latched properly in Figure 59, but works perfectly in the full 
design. The timing issues of the simulation arise again: changing the way the 
program counter is latched in the Interrupt to make it work in Figure 59 may cause the simulation 
of the full design to fail. 

The last component for a complete TMR design is the Error Syndrome Storage 
Device (ESSD). This is a device used to store error syndromes for future analysis. The 
full design with ESSD will be introduced in the next chapter. 

IX. THE FULL DESIGN WITH ESSD 


After designing and simulating the different components, the TMR design is almost 
complete. In the previous chapter, it was shown that the voters are able to report 
and locate an error when it occurs. Errors on different buses are reported by 
cid_1(50:0), cid_0(50:0), err(50:0), and v_err(50:0). The pattern generated for an error 
on these buses is called the error syndrome. 

A space system like CFTP will leave the earth for a long time. It is desired to 
have some kind of device to collect the error syndrome whenever an error occurs. The 
error syndrome can be used to analyze the health of the system or help understand the 
space environment for a system on orbit. If the same error is generated several times, it 
can be assumed that a certain device is defective or deviant. The solution may be to 
reprogram the FPGA or reset the system. The ESSD is the device designed to collect error 
syndromes. In order to be able to download this data after a period of time, the ESSD has 
to store the error syndromes to memory. 

A. THE FUNCTION OF ESSD 

Simulation of the full design without ESSD was introduced in the previous 
chapter. The functions of the ESSD are to store the error syndromes and to record where the errors are 
located in the system. The ESSD design largely follows the concept used in building 
the Interrupt. It is a state machine coded in VHDL and runs at double speed, that is, 
in synchronization with the memory clock. It has to run at double speed in order to work 
with errors generated in either half of the KDLX clock cycle. Because the ISR will be 
triggered when an error occurs, the choices for when the ESSD operates are before, 
after or sometime within the ISR. 

Halting normal operation is the last choice since the ISR is already designed to do 
that. It is reasonable not to interrupt the normal operation unless absolutely necessary. 
Too many interruptions may decrease the performance of a system or cause the program 
to lose track of the instruction sequence. Due to these reasons, the ESSD is implemented 
in the ISR instead of triggering another interrupt routine somewhere in normal operation. 

To minimize the impact on the ISR, the ESSD is designed to start right before the first 
instruction in the ISR begins. The two NOPs following the TRAP instruction are a good 
starting point for the ESSD since the pipeline is cleaned and no useful instruction is 
executing. Consolidating all of the concepts above, the state machine for the ESSD is constructed 
as shown in Figure 68 and its VHDL code is in Appendix C, section D. 



Figure 68. State Machine of ESSD 


The first eight states are very similar to the states in the Interrupt because the 
ESSD has to wait until the two NOPs are inserted. The LatchState_A latches the program 
counter, the data address, and the 51-bit data on the cid_0 and cid_1 buses. The 
StallState stalls the KDLX in order to start storing the latched error syndromes. The ESSD stores 
data to memory as a stack which starts at the bottom and runs to the top. For simplicity 
of explanation, address 0059₁₆ is used as the starting point and data is stored from 
the least significant bit to the most significant. This function is illustrated in Figure 69. 

Figure 69. Function of ESSD Storing 


Each data word in memory is 24 bits wide, so a 51-bit data syndrome takes three 
clock cycles to store. The most significant three bits of cid_0 and cid_1 are stored with 
21 zeros ahead. A counter internal to the ESSD is used to track the memory locations. The 
next error syndrome will start at address 51₁₆. States from StoreState0_A to 
StoreState_pc implement the actions described here. During this period, all of the processors 
are stalled and the memory is controlled by the ESSD. The last state is the BackState, which 
releases the processors to start the ISR. 
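
The storage layout can be sketched in Python as follows. The sketch assumes that one record occupies eight 24-bit words (three for cid_0, three for cid_1, one for the data address and one for the program counter), which is consistent with a record starting at 59₁₆ and the next one starting at 51₁₆. The order of cid_0, cid_1 and the data address within a record is an assumption; the text only indicates that the program counter is stored last. Function names are hypothetical.

```python
WORD_BITS = 24
WORD_MASK = (1 << WORD_BITS) - 1


def split_syndrome(value):
    """Split a 51-bit syndrome into three 24-bit words, least significant first.

    The third word carries only the top three bits, padded with 21 zeros.
    """
    return [(value >> (WORD_BITS * i)) & WORD_MASK for i in range(3)]


def store_record(memory, top, cid_0, cid_1, data_addr, pc):
    """Write one error record downward from address `top`; return the next free address.

    Word order within the record is assumed for illustration.
    """
    words = (split_syndrome(cid_0) + split_syndrome(cid_1)
             + [data_addr & 0xFFFF, pc & 0xFFFF])
    addr = top
    for word in words:
        memory[addr] = word
        addr -= 1
    return addr


mem = {}
next_top = store_record(mem, 0x59, 0x0, 0x006E800000000, 0x004C, 0x0008)
```

With the values from the previous chapter's simulation, the record fills addresses 59₁₆ down to 52₁₆ and `next_top` is 51₁₆, matching the starting address given for the next error syndrome.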

The ESSD runs at twice the speed of the TMRA, but the states after NopState1_B 
are not doubled as they are in the other state machines. Because the ESSD and the memory 
both run at double speed, one memory access can occur in every ESSD state. Therefore, 
states between StoreState0_A and BackState do not need to be duplicated. The Interrupt 
and Reconciler stop functioning when the KDLX is stalled. The schematic symbol of the ESSD 
is shown in Figure 70. 


[Schematic symbol "essd". Inputs (left side): clk_s, reset_s, err, addr_in(15:0), 
pc_in(15:0), cid0_in(50:0), cid1_in(50:0). Outputs: addr_s(15:0), sel_addr(15:0), 
ess(23:0), sel_s(23:0), state_s(4:0), stall_s] 

Figure 70. Schematic Symbol of ESSD 


Input signals at the left side are used for latching data from the buses. Output 
signals sel_addr(15:0), sel_s(23:0), and sel_wr are used to switch muxes in order to insert 
data on addr_s(15:0), ess(23:0), and wr_s, respectively. The stall_s goes low to stall 
KDLX when error syndromes are ready to be stored. 

B. THE FULL DESIGN WITH ESSD 

1. Schematic 

The schematic for the full design with ESSD is shown in Figure 71. Compared 
with Figure 61, the ESSD is added at the bottom right and all incoming or outgoing buses 
are intercepted with muxes. The ESSD takes over the RAMs once it starts to store 
error syndromes. Three muxes at the input side of the RAMs are used to insert the data 
address, data and write signal. The other three muxes on the output buses of the RAMs are used 
to keep any unrelated data from reaching the Reconciler while the error syndromes are stored. 

Two large latches called latch51 sit on the cid_0 and cid_1 buses ahead of 
the ESSD. This part is coded in VHDL and is necessary for this design. It latches data 
when err is high and keeps the latched data until the next error is detected. Therefore, the 
ESSD can capture cid_0 and cid_1 whenever it wants because this data is available and 
stable on the bus. More explanation of how it functions and why it is vital in this design 
will be described in the simulation discussion. 
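
The hold behavior of latch51 can be sketched in Python as below. This is a clocked behavioral model with hypothetical names, assuming the latch captures its input on a clock event only while err is high and otherwise keeps its previous output; it does not model setup and hold timing.

```python
class Latch51:
    """Hold the last 51-bit value seen while err was high (behavioral sketch)."""

    MASK = (1 << 51) - 1

    def __init__(self):
        self.q = 0

    def clock(self, err, d):
        """On each clock, capture d if err is high; otherwise keep the old value."""
        if err:
            self.q = d & self.MASK
        return self.q


latch = Latch51()
before = latch.clock(0, 0x123)               # no error yet: output stays 0
captured = latch.clock(1, 0x006E800000000)   # error: syndrome is captured
held = latch.clock(0, 0x0)                   # bus has changed, latch still holds
```

Because `held` still equals the captured syndrome after the bus changes, a consumer such as the ESSD can read it several clocks later.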
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Simulation 


Fewer signals are monitored here than with the full design in the previous chapter,
since the test bench is almost identical except for a few extra instructions for checking
the error syndromes stored in memory. The functions of the TMRA, Interrupt, and Reconciler in
the full design without the ESSD have already been described, so this simulation only shows
how the ESSD works. The important signals and all buses on the ESSD are monitored in the
simulation shown in Figures 72 and 74. Most of the parts that are identical to the previous
chapter are omitted, and only the important functions of the ESSD are explained.





Figure 72. Simulation of the Full Design with ESSD 


In Figure 72, five clocks are listed. The Reconciler, Interrupt, and ESSD all work
in parallel, so the timing constraints for clk_ir and clk_s are identical. The new clock, clk_l,
for latch51 needs to run at double speed, and it has to be stable before the ESSD is ready.
Because of this, latch51 has less setup and hold time than the ESSD.

As before, the error is caught at point 1, and cid_1 and cid_0 indicate where the error
is. Note that cid_1 and cid_0 are the output data of latch51. Unlike the simulation
in the previous chapter, the data on cid_1 and cid_0 show up at point 2 and stay latched
until the next error is reported in normal operation. The ESSD, therefore, is able to store
these two values when state_s is 02₁₆.

The most important reason for using latch51 is to make the data stable on the bus.
The zoom-in at point 5 in Figure 63 is shown in Figure 73. The data on cid_1 and cid_0 is
available after the memory clock cycle and becomes unstable before the next rising edge
of the Interrupt or Reconciler clock. Because the ESSD runs at exactly the same
clock speed as the Interrupt and Reconciler, both cid_1 and cid_0 have to be available
until the next rising edge of the Interrupt (or Reconciler) clock in order to be latched
correctly by the ESSD. For this reason, latch51 is designed to keep the data stable
so that the ESSD can latch it in any state before storing the error syndromes.
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Figure 73. Detail Timing at point 5 in previous simulation 


Back in Figure 72, point 3 is the first instruction fetched in the ISR. At the same
time the KDLX is fetching this instruction, the ESSD triggers stall_s at point 4 to stall the
processors. In the next clock cycle, the muxes are switched to zeros and 0059₁₆ appears
on the address bus to the RAMs.
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Figure 74. Simulation of the Full Design with ESSD (continued) 


Following the algorithm explained in Figure 69, the bus ess at point 5 proves this 
function works. Once the ESSD finishes at point 6, it gives all of the buses back and re¬ 
leases the processors. The first instruction of the ISR starts in the next clock cycle. 

Extra instructions in the RAMs are for loading error syndromes stored in memory 
back to the registers for checking purposes. These instructions start at point 7 and the 
output data at point 8 proves that all values are stored correctly. 

C. CHAPTER SUMMARY 

All components for a complete design have been introduced. The ESSD was not
discussed until this chapter in order to simplify the simulation. Too many things would
have needed explanation in the simulation result if the ESSD had not been described
separately. This would have made the whole simulation look complicated and might have
obscured the importance of the ISR. Introducing the ESSD separately means that the
functions of the Reconciler, Interrupt, and ESSD are shown clearly in all simulations.
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Not a conceptual design, this full design was simulated and checked. Design of 
these components can be improved and more information is needed for better performance
of the TMR system. These topics for follow-on research will be discussed in the
next chapter.





X. CONCLUSIONS AND FOLLOW-ON RESEARCH 


This thesis has described the first complete TMR design on an FPGA for
the CFTP. Major components had been defined in previous theses, but most of them had
to be redesigned as the KDLX processor became better understood. Each component
was simulated to prove its function. Some timing issues were discussed when different
components were connected with each other. The full design has also demonstrated the
ability to detect and correct an SEU in simulation.

A. OVERVIEW 

The TMR Assembly consists of three KDLX processors and voters in order to
detect and correct errors. A majority voter can only handle one error at a time. Since the
TMR Assembly has several voters in it, it is able to report errors on different signals
simultaneously. For example, the cid_1 and cid_0 buses of the TMRA can identify errors on
the program counter and data at the same time. The processor causing errors on the
program counter may not be the same one that generates errors on data.
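
As an illustration of the voting idea summarized above, the following Python sketch performs a bitwise two-out-of-three majority vote and flags which input disagrees. It is a simplified model under stated assumptions: the function name, the 16-bit default width, and the returned tuple are hypothetical and do not reproduce the actual cid bus encoding of the TMRA.

```python
def majority_vote(a, b, c, width=16):
    """Bitwise 2-of-3 majority vote over three redundant values.

    Returns (voted, err, disagree), where disagree[i] is True when
    input i differs from the voted result.
    """
    mask = (1 << width) - 1
    a, b, c = a & mask, b & mask, c & mask
    # Each output bit takes the value that at least two inputs agree on.
    voted = (a & b) | (a & c) | (b & c)
    err = not (a == b == c)
    disagree = tuple(x != voted for x in (a, b, c))
    return voted, err, disagree


# One processor delivers a corrupted word; the voter masks it and reports it.
result = majority_vote(0x0003, 0x0003, 0x0013)
```

A single voter of this kind masks one faulty input at a time; running one voter per signal group is what allows errors on, say, the program counter and the data bus to be reported in the same cycle.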

In order to coordinate memory access, the Reconciler is built to reconcile the
Harvard and Von Neumann architectures. It runs at twice the KDLX clock rate
and performs the instruction memory access first, followed by the data memory access.
This component purely implements read and write access with memory and does not
relate directly to error detection or correction.

The Interrupt provides an ISR to correct any inconsistency in registers among the
three processors. This unit is triggered when an error is found by the TMRA. If an error
occurs somewhere on the bus but not inside the registers, the ISR will still be triggered
but no error will be found. An error syndrome records the program counter, the memory
address, and any inconsistent bits on the data, address, program counter, read, write, and
program read signals as reported on the cid buses. This information is latched in the ESSD
and is stored to memory during the ISR. Analyzing error syndromes can help a designer
correct the current design.
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B. 


CONCLUSIONS 


The simple flow chart in Figure 75 illustrates the overall procedure for correcting an
error in the TMR design and makes the role of each component in the full design clear.
The Interrupt is generated for error correction only, and the ESSD is used for storing
error syndromes only.



Figure 75. Flowchart of Error Correction for TMR design 

A reprogrammable space device such as the CFTP has great potential for the future.
The TMR on an FPGA functions as an SOC, which saves space on the board and offers the
flexibility of modification. Combining the TMR design with some other features lets the
CFTP act as an error-free device. Its reconfigurability widens its usage
in missions and lets state-of-the-art technology be applied to many applications.
C. FOLLOW-ON RESEARCH 

A first functioning TMR design is complete. This circuit was simulated and
verified in software. It is possible to instantiate this design on a development board to
verify its function. Before doing that, some modifications need to be made. The performance
of each component can be improved as well. Furthermore, a faster soft-core processor
will be needed to speed up the overall performance of the TMR.
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1. Modification on Current Design 

Most components, such as the Reconciler, Interrupt, and ESSD, are essentially state
machines coded in VHDL. It is possible to combine these three into one large state machine
since they all run at double speed. Doing this requires a clear understanding of the
different functions of the different components. Debugging such a large state machine
needs to be done carefully since any modification of one state may affect functions in
other states. On the other hand, there are several different ways to code a component.
Other methodologies are sometimes better than a state machine, depending on the
characteristics of the component.

A voter error is not considered in this thesis due to time constraints. This kind of
error does not need to trigger the ISR. When a voter votes incorrectly, the output is not
trustworthy. The data can be either discarded or re-voted depending on the situation. The
ESSD may need to be revised so that it does not save every error syndrome, in order to
save memory space.

The memory selected for the simulation is based on what is available in the ISE
software. If possible, a real Von Neumann architecture memory should be built.
Modifications to the TMRA and Reconciler will be necessary at that time. The real
environment on the development board must be considered before making these modifications.
This avoids duplicate work and makes it possible to compare the simulation result in
software with the one in hardware.

An SEU can occur anywhere in the TMR design. More issues need to be solved
if this error occurs in the Reconciler, Interrupt, or ESSD. Adding components to increase
reliability also increases the probability of having an SEU in one of them. The trade-off
between these conditions needs more discussion.

2. Faster Processors 

Several requirements are considered when searching for a faster processor. First,
the new processor has to be faster than the current 16-bit RISC KDLX. Second, it has to
be a soft-core processor. Third, it needs to be compatible with the Xilinx Virtex XCV800
HQ240 FPGA selected for the CFTP. Other features such as the use of cache or a Harvard
architecture can be reconsidered.
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Many soft-core processors nowadays use cache to improve their perfonnance 
even though it is possible to have an SEU in it. Detecting and correcting an SEU in a 
cache cannot use the same method as with the registers. The contents of the caches need 
to be reloaded by some method. Study of the SEE on a Pentium®5 III processor proves 
that utilizing cache in different ways can change the testing result dramatically [12]. 
Therefore, it is possible to take advantage of cache without increasing the probability of 
having an error, and consideration of future processors should include ones with cache. 

Using a Von Neumann architecture processor would simplify the TMR design.
The Reconciler could be removed and less control logic would be needed in the TMRA for
the data bus.


Table 21 lists some candidate commercial processors that are currently available. 


Commercial Processors

Company    Processor         Architecture  Features
Xilinx     MicroBlaze        32-bit RISC   1. No cache
                                           2. Harvard bus
ARM        ARM7TDMI          32-bit RISC   1. Most have cache
                                           2. Von Neumann bus
                                           3. Hard core
MIPS       MIPS64 5Kc (5Kf)  64-bit RISC   1. Programmable cache 0-64KB
                                           2. Co-processor interface
                                           3. Floating-point pipeline
                                           4. Hard core
MIPS       MIPS64 20Kc       64-bit RISC   1. 32KB caches
                                           2. Superscalar
                                           3. Hard core
Sandcraft  SR71010B          64-bit RISC   1. MIPS64 based
                                           2. L1 32KB cache
Tensilica  Xtensa            32-bit RISC   1. Local data and instruction caches
Altera     Nios              32-bit RISC   1. Instruction master is a 16-bit wide,
                                              latency-aware Avalon bus master
                                           2. Configurable cache size
ARC        ARCtangent-A4     32-bit RISC   1. Processor can be configured with Harvard
                                              bus architecture (separate instruction/data
                                              buses) or a von Neumann bus architecture
                                              (unified instruction/data buses)
                                           2. User-configurable instruction and data cache


Table 21. Commercial Soft-Core Processors 


5 Pentium is a registered trademark of Intel Corporation. 
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Some processors have configurable cache which gives the user some flexibility. 
The advantages and disadvantages of soft-core and hard-core processors have been
described in Chapter I, so no hard-core processors are considered. Candidates for the
TMR are MicroBlaze, SR71010B, Xtensa, Nios, and ARCtangent-A4.

Commercial processors are usually expensive because they are proprietary.
Sometimes these processors come with their own development kit, which makes
implementation in other software impossible. Part of the design of a commercial processor is
sometimes protected by the company and not accessible to the user. Even though revising
a processor is not always required, studying the source code is a good and fast way to
understand the processor itself. In addition, information on these commercial processors
is limited since most of the time only the data sheet can be found on the Internet.

Sometimes people share their invention or modification of cores with the public.
These cores may or may not be fully tested, and usually the designer is looking for other
people to test them. These cores are called OpenCores. OpenCores are free and can be
easily downloaded from the Internet. The disadvantage of using OpenCores is that they are
hard to use. Some designers do not describe their design in detail, and development tools
vary from designer to designer. People post their questions on the website and hope
someone will answer them, so there is no customer support like that offered for commercial
processors. Some OpenCores are collected in Table 22.

Some information is not complete due to the lack of description by designers or
other users. These cores do not have many restrictions and can be modified if desired.
Based on the information found, the SPARC and RISC R1000 are very common processors.
The RISC R1000 has been tested and has successfully run a video image program.
Many devices are also compatible with this processor. The RISC R1200 is almost
identical to the R1000 except for the cache inside. The Yellow Star, which is
actually the MIPS32 R3000 processor, is known as a very powerful processor. It has been
tested by many users as well.
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OpenCores 

Architecture  Name                     Features
SPARC V8      LEON VHDL                1. AMBA AHB and APB on-chip buses
              32 bit                   2. Data cache is a direct-mapped cache configurable
                                          to 1-64 kbyte
SPARC V7      ERC32                    1. A radiation-tolerant processor developed for
              32 bit                      space applications
                                       2. Two platforms are supported: SPARC Solaris-2.5.1
                                          (or higher), and x86 linux (libc5)
                                       3. VHDL model runs on Unix systems
RISC          OpenRisc R1000           1. Tested on Xess XSV800 and Flextronics
              32 bit                      Semiconductor development boards
RISC          OpenRisc R1200           1. Tested on Xess XSV800 and Flextronics
              32 bit                      Semiconductor development boards
                                       2. cache
RISC          Yellow Star              1. Capable of executing 32-bit instructions based on
              (MIPS32 R3000)              the MIPS R3000 microprocessor instruction set and
              32 bit                      has been tested running large blocks of compiled
                                          C code.
                                       2. Fully functional and compatible interrupt system.
                                          Can handle all exceptions cleanly and correctly.
                                       3. On-chip cache control and Memory Management Unit
RISC          Risc 16f84               1. The "risc16f84_clk2x.v" core has been coded
                                          completely, synthesized and tested for correct
                                          operation (and debugged!) inside a Xilinx XC2S200
                                          FPGA
RISC          Plasma                   1. Supports interrupts and all MIPS I(TM) user mode
                                          instructions except unaligned load and store
                                          operations (which are patented) and exceptions,
                                          which can be easily avoided.
                                       2. Tested on an Altera FPGA running at 16.5 MHz
                                          (synthesized for 29.8 MHz)
                                       3. Currently running on an Altera EP20K200EFC484-2X
                                          FPGA and a Xilinx FPGA


Table 22. OpenCores 


These OpenCores have been tested and proven on certain FPGAs. In order to use these
processors in the TMR design, more study of their source code is required.
Finally, they will need to be tested and simulated in the ISE software before any design
work related to the TMR begins.
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APPENDIX A: SCHEMATICS 


Appendix A contains all schematics, test benches, and simulation results of the
components in this thesis. Simple schematic symbols are introduced as figures in the
chapters and are not included here. The features and settings of each component and test
bench are summarized as well. Long test benches are cut into pieces and only the important
parts are shown. Sometimes a different notation is used in order to explain how a component
is tested.


The simulation result is always shown completely. Important parts that need to be
explained are duplicated or annotated. All values used in the test bench and
the simulation result are hexadecimal, and R0 is always zero.

A. 24-BIT MEMORY 
1. Schematic 


This memory is a RAM. It is triggered at the rising clock edge. Both the write
enable (i.e., WE) and memory enable (i.e., EN) pins are active low. The default value of
this memory is zero.
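
The behavior described above (rising-edge triggered, active-low WE and EN, contents defaulting to zero) can be captured in a small Python model. This is a sketch for illustration only: the class name, the depth of 256 words, and the choice to return the newly written word on a write cycle are assumptions, not properties taken from the ISE memory primitive.

```python
class SyncRam24:
    """Behavioral sketch of a 24-bit synchronous RAM with active-low controls."""

    def __init__(self, depth=256):
        self.mem = [0] * depth   # default contents are zero
        self.dout = 0

    def rising_edge(self, en_n, we_n, addr, din=0):
        """Model one rising clock edge; en_n and we_n are active low."""
        if en_n == 0:                       # memory enabled
            if we_n == 0:                   # write cycle
                self.mem[addr] = din & 0xFFFFFF
            # Assumption: the output reflects the addressed word after the edge.
            self.dout = self.mem[addr]
        return self.dout


ram = SyncRam24()
ram.rising_edge(en_n=0, we_n=0, addr=0x03, din=0x440103)   # write
read_back = ram.rising_edge(en_n=0, we_n=1, addr=0x03)     # read
```

When EN is high the model ignores both the address and the write request, which mirrors how a disabled memory should leave its contents and output untouched.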
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2. Test Bench 


This test bench was originally in a single row. It is cut into two rows in order to
fit the page. The vertical line at time 2100 ns is the stop point of the simulation.
The clock high and low times are each 50 ns. The input setup time and output valid delay
are each 10 ns.
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Simulation Result 
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B. KDLX WITHOUT MEMORY 
1. Schematic 





2. Test Bench 


The data bus is high impedance. Two values are offered at clock 5 and 6 for 
KDLX to load into registers. Clock high time and low time is 50 ns. Input setup time 
and output valid delay is 10 ns. 
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Simulation Result 



1. Schematic 


The instruction memory at the left side is a ROM. The data memory at the right
side is a RAM. The data memory is pre-configured with 0003₁₆. Both memories are
triggered at the rising clock edge.






2. Test Bench of Instruction Set


For the processor, the clock high and low times are each 50 ns; the input setup time and
output valid delay are each 10 ns. For the memories, all timing settings are half of those
of the processor clock. The bi-directional bus is high impedance.


Nothing special is needed in the test bench, so only the first and last parts are
shown here. The KDLX is reset and the memories are enabled at time 200 ns. Since the
instruction memory is configurable, the test benches for all instruction sets are the same.
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3. Tables and Simulation Results of Instruction Sets 


a. Implementation Table of Instruction Set 1 


Instruction (operation symbol)        Opcode  Expected Value
LW     R1←Mem(R0+03)                  440103
SW     R1→Mem(R0+08)                  450108  0003
LW     R2←Mem(R0+04)                  440204
SW     R2→Mem(R0+09)                  450209  0003
ADD    R1+R2→R3                       011320
SW     R3→Mem(R0+0D)                  45030D  0006
ADDI   R1+ext(F9)→R4                  4114F9
SW     R4→Mem(R0+0E)                  45040E  FFFC
ADDUI  R1+(0A)→R5                     21150A
SW     R5→Mem(R0+0F)                  45050F  000D
AND    R1·R3→R6                       091630
SW     R6→Mem(R0+10)                  450610  0002
ANDI   R4·(FD)→R7                     2947FD
SW     R7→Mem(R0+11)                  450711  00FC
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Instruction (operation symbol) 

Instruction (operation symbol)        Opcode  Expected Value
LHI    R8←FF(0)^8                     0808FF
SW     R8→Mem(R0+12)                  450812  FF00
OR     R1+R3→R9                       0A1930
SW     R9→Mem(R0+13)                  450913  0007
ORI    R1+(F0)→R10                    2A1AF0
SW     R10→Mem(R0+14)                 450A14  00F3
SEQ    R1=R2→R11=1                    181B20
SW     R11→Mem(R0+15)                 450B15  0001
SEQ    R1≠R3→R12=0                    181C30
SW     R12→Mem(R0+16)                 450C16  0000
SEQI   R1=(0003)→R13=1                581D03
SW     R13→Mem(R0+17)                 450D17  0001
SEQI   R1≠(0004)→R14=0                581E04
SW     R14→Mem(R0+18)                 450E18  0000
SLL    R4<<R2=(0003)→R15              114F20
SW     R15→Mem(R0+19)                 450F19  FFE0
SLLI   R4<<(0005)→R3                  514305
SW     R3→Mem(R0+1A)                  45031A  FF80
SRA    R4>>R1=(0003)→R5               134510
SW     R5→Mem(R0+1B)                  45051B  FFFF
SRLI   R4>>(0003)→R6                  524603
SW     R6→Mem(R0+1C)                  45061C  1FFF
SUBI   R8-ext(7B)→R7                  43877B
SW     R7→Mem(R0+1D)                  45071D  FE85
XOR    R9⊕R10→R11                     0B9BA0
SW     R11→Mem(R0+1E)                 450B1E  00F4
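
Several expected values in the Instruction Set 1 table depend on 16-bit wraparound, sign extension of the 8-bit immediate, and the difference between arithmetic and logical shifts. The Python sketch below reproduces a few of them. The helper names are hypothetical, and the semantics (ext() sign-extends, ADDUI and ANDI use the immediate without sign extension) are inferred from the table rather than taken from the KDLX source.

```python
MASK = 0xFFFF  # registers are 16 bits wide


def ext(imm8):
    """Sign-extend an 8-bit immediate to 16 bits."""
    return (imm8 | 0xFF00) if imm8 & 0x80 else imm8


def to_signed(x):
    """Interpret a 16-bit pattern as a two's-complement integer."""
    return x - 0x10000 if x & 0x8000 else x


def sll(x, n):
    return (x << n) & MASK            # shift left logical


def srl(x, n):
    return (x & MASK) >> n            # shift right logical (zero fill)


def sra(x, n):
    return (to_signed(x) >> n) & MASK  # shift right arithmetic (sign fill)


r1 = 0x0003                       # loaded from the pre-configured data memory
r4 = (r1 + ext(0xF9)) & MASK      # ADDI  R1+ext(F9)
r8 = (0xFF << 8) & MASK           # LHI   FF into the upper byte
r7 = (r8 - ext(0x7B)) & MASK      # SUBI  R8-ext(7B)
```

Evaluating sra(r4, 3) and srl(r4, 3) on the same input shows why the SRA and SRLI rows store different values from the same register R4.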
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b. 


Simulation Result of Instruction Set 1 


[Waveform figure, shown in three segments. Signals: /testbench/clk_p, /testbench/clk_ram_rom,
/testbench/en_rom, /testbench/en_ram, /testbench/reset_p, /testbench/stall_p,
/testbench/instr_pass, /testbench/out_mem, /testbench/prog_rd_p, /testbench/read_p,
/testbench/write_p, /testbench/data_p]
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c. 


Tables of Registers and Memories in Simulation 1 


Instruction Mem

Addr  Value    Addr  Value
00             2D    45071D
01    440103   2E    450B1E
02    440204   2F    000000
03    000000   30    000000
04    000000   31    000000
05    450108   32    450101
06    450209   33    450201
07    000000   34    450301
08    011320   35    450401
09    4114F9   36    450501
0A    21150A   37    450601
0B    000000   38    450701
0C    091630   39    450801
0D    45030D   3A    450901
0E    45040E   3B    450A01
0F    45050F   3C    450B01
10    450610   3D    450C01
11    2947FD   3E    450D01
12    0808FF   3F    450E01
13    0A1930   40    450F01
14    2A1AF0   41    000000
15    450711   42    000000
16    450812   43    000000
17    450913   44    44010D
18    450A14   45    44020E
19    181B20   46    44030F
1A    181C30   47    440410
1B    581D03   48    440511
1C    581E04   49    440612
1D    450B15   4A    440713
1E    450C16   4B    440814
1F    450D17   4C    440915
20    450E18   4D    440A16
21    114F20   4E    440B17
22    514305   4F    440C18
23    134510   50    440D19
24    524603   51    440E1A
25    450F19   52    440F1B
26    45031A   53    44011C
27    45051B   54    44021D
28    45061C   55    44031E
29    43877B   56    000000
2A    0B9BA0   57    000000
2B    000000   58    000000
2C    000000   59    000000


Register

Reg  First value  Later value
00
01   0003
02   0003
03   0006         FF80
04   FFFC
05   000D         FFFF
06   0002         1FFF
07   00FC         FE85
08   FF00
09   0007
10   00F3
11   0004         00F4
12   0000
13   0001
14   0000
15   FFE0


Data Mem

Addr  Value
00
01
02
03
04
05
06
07
08    0003
09    0003
0A
0B
0C
0D    0006
0E    FFFC
0F    000D
10    0002
11    00FC
12    FF00
13    0007
14    00F3
15    0001
16    0000
17    0001
18    0000
19    FFE0
1A    FF80
1B    FFFF
1C    1FFF
1D    FE85
1E    00F4
1F
20
21
22
23
24
25
26
27
28
29
2A
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d. Implementation Table of Instruction Set 2 



Instruction (pseudo code)             Opcode  Expected Value
SGE    R1≥R3→R13=1                    191D30
SW     R13→Mem(R0+1F)                 450D1F  0001
SGE    R15≥R14→R9=0                   19F9E0
SW     R9→Mem(R0+20)                  450920  0000
SGEI   R15≥ext(E8)→R10=0              59FAE8
SW     R10→Mem(R0+21)                 450A21  0000
SGEI   R15≥ext(E0)→R11=1              59FBE0
SW     R11→Mem(R0+22)                 450B22  0001
SGT    R4>R15→R6=1                    1A46F0
SW     R6→Mem(R0+23)                  450623  0001
SGT    R15>R4→R7=0                    1AF740
SW     R7→Mem(R0+24)                  450724  0000
SGTI   R15>ext(FF)→R8=0               5AF8FF
SW     R8→Mem(R0+25)                  450825  0000
SGTI   R15>ext(87)→R9=1               5AF987
SW     R9→Mem(R0+26)                  450926  0001
SLE    R1≤R2→R10=1                    1B1A20
SW     R10→Mem(R0+27)                 450A27  0001
SLE    R1≤R13→R11=0                   1B1BD0
SW     R11→Mem(R0+28)                 450B28  0000
SLEI   R1≤ext(03)→R12=1               5B1C03
SW     R12→Mem(R0+29)                 450C29  0001
SLEI   R1≤ext(02)→R13=0               5B1D02
SW     R13→Mem(R0+2A)                 450D2A  0000
SLT    R15<R1→R6=1                    1CF610
SW     R6→Mem(R0+01)                  450601  0001
SLT    R1<R15→R7=0                    1C16F0
SW     R7→Mem(R0+02)                  450702  0000
SLTI   R1<ext(0D)→R8=1                5C180D
SW     R8→Mem(R0+03)                  450803  0001
SLTI   R1<ext(01)→R9=0                5C1901
SW     R9→Mem(R0+04)                  450904  0000
SNE    R1≠R2→R10=0                    1D1A20
SW     R10→Mem(R0+05)                 450A05  0000
SNE    R1≠R15→R11=1                   1D1BF0
SW     R11→Mem(R0+06)                 450B06  0001
SNEI   R1≠ext(03)→R12=1               581C03
SW     R12→Mem(R0+07)                 450C07  0001
SNEI   R15≠ext(E1)→R13=0              58FDE1
SW     R13→Mem(R0+08)                 450D08  0000
SRAI   R3>>(0006)→R6                  533606
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Instruction (pseudo code) 

Instruction (pseudo code)             Opcode  Expected Value
SW     R6→Mem(R0+09)                  450609  FFFE
SRL    R3>>R2=(0003)→R7               123720
SW     R7→Mem(R0+0A)                  45070A  1FF0
XORI   R15⊕(8A)→R8                    2BF88A
SW     R8→Mem(R0+0B)                  45080B  FF6A
SUBUI  R3-(80)→R9                     233980
SW     R9→Mem(R0+0C)                  45090C  FF00
SUB    R1-R3→R14                      031E30
SW     R14→Mem(R0+0D)                 450E0D  0083
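
The opcodes in the implementation tables follow a regular pattern that can be read off the rows themselves: the top byte selects the instruction, the next two hex digits name the first source register and the destination (or the stored register for SW), and the low byte holds either an 8-bit immediate or a second source register in its upper digit. The Python sketch below decodes a 24-bit word on that basis. The field layout is an inference from the tables, not a statement of the KDLX specification, and the function and field names are hypothetical.

```python
def decode(word):
    """Split a 24-bit instruction word into fields (layout inferred from the tables)."""
    return {
        "opcode": (word >> 16) & 0xFF,  # e.g. 0x45 for SW, 0x41 for ADDI
        "rs1": (word >> 12) & 0xF,      # first source / base register
        "rd": (word >> 8) & 0xF,        # destination (register stored, for SW)
        "imm8": word & 0xFF,            # immediate form: low byte
        "rs2": (word >> 4) & 0xF,       # register form: second source
    }


sgt = decode(0x1A46F0)    # SGT  R4>R15 -> R6
sgti = decode(0x5AF987)   # SGTI R15>ext(87) -> R9
sw = decode(0x450D1F)     # SW   R13 -> Mem(R0+1F)
```

Decoding a row this way is a quick check that the register numbers in the pseudo code column agree with the opcode column.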


e. Simulation Result of Instruction Set 2 


[Waveform figure, shown in segments. Signals: /testbench/clk_p, /testbench/clk_ram_rom,
/testbench/en_rom, /testbench/en_ram, /testbench/reset_p, /testbench/stall_p,
/testbench/instr_pass, /testbench/out_mem, /testbench/prog_rd_p, /testbench/read_p,
/testbench/write_p, /testbench/data_p]
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f. Tables of Registers and Memories in Simulation 2 


Instruction Mem

Addr  Value    Addr  Value
00             30    450A21
01    410103   31    450B22
02    410203   32    1A46F0
03    0803FF   33    1AF740
04    0804FF   34    5AF8FF
05    0805FF   35    5AF987
06    08061F   36    450623
07    410380   37    450724
08    4104FC   38    450825
09    4105FF   39    450926
0A    2166FF   3A    1B1A20
0B    0807FE   3B    1B1BD0
0C    0808FF   3C    5B1C03
0D    080FFF   3D    5B1D02
0E    210AF3   3E    450A27
0F    217785   3F    450B28
10    210BF4   40    450C29
11    410907   41    450D2A
12    410D01   42    1CF610
13    410E00   43    1C17F0
14    410C00   44    5C180D
15    410FE0   45    5C1901
16    000000   46    450601
17    000000   47    450702
18    450100   48    450803
19    450200   49    450904
1A    450300   4A    1D1A20
1B    450400   4B    1D1BF0
1C    450500   4C    581C03
1D    450600   4D    58FDE1
1E    450700   4E    450A05
1F    450800   4F    450B06
20    450900   50    450C07
21    450A00   51    450D08
22    450B00   52    533603
23    450C00   53    123720
24    450D00   54    2BF88A
25    450E00   55    233980
26    450F00   56    031E30
27    000000   57    450609
28    000000   58    45070A
29    000000   59    45080B
2A    191D30   5A    45090C
2B    19F9E0   5B    450E0D
2C    59FAE8   5C    000000
2D    59FBE0   5D    000000
2E    450D1F   5E    000000
2F    450920   5F    000000


Register

Reg  First value  Later value
00
01   0003         0003
02   0003         0003
03   FF80         FF80
04   FFFC         FFFC
05   FFFF         FFFF
06   1FFF         FFFE
07   FE85         1FF0
08   FF00         FF6A
09   0007         FF00
10   00F3         0000
11   00F4         0001
12   0000         0001
13   0001         0000
14   0000         0083
15   FFE0         FFE0


Data Mem

Addr  Value
00
01    0001
02    0000
03    0001
04    0000
05    0000
06    0001
07    0001
08    0000
09    FFFE
0A    1FF0
0B    FF6A
0C    FF00
0D    0083
0E
0F
10
11
12
13
14
15
16
17
18
19
1A
1B
1C
1D
1E
1F    0001
20    0000
21    0000
22    0001
23    0001
24    0000
25    0000
26    0001
27    0001
28    0000
29    0001
2A    0000
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g. Implementation Table of Instruction Set 3 



Instruction (pseudo code)                        Opcode  Expected Value
LW     R1←Mem(R0+03)                             410103
LW     R2←Mem(R0+04)                             410204
LW     R3←Mem(R0+00)                             410300
LW     R4←Mem(R0+06)                             410406
BNEZ   R1≠0→Prog_Addr←(05)+1+ext(04)             C01004
       Note: PC=05 and (05)+1+ext(04)=0A
BEQZ   R3=0→Prog_Addr←(0A)+1+ext(04)             C13004
       Note: PC=0A and (0A)+1+ext(04)=0F
ADDI   R0+ext(25)→R5                             410525
J      (0020)→Prog_Addr                          C80020
JAL    (0014)→Prog_Addr; (23)→R15                E80014
       Note: (23) is return address
ADDI   R0+ext(8A)→R6                             41068A
ADDI   R0+ext(40)→R7                             410740
ADD    R1+R2→R8                                  011820
ADD    R1+R4→R9                                  011940
SW     R15→Mem(R0+01)                            450F01  0023
JALR   R5→Prog_Addr; (1D)→R15                    685000
       Note: (1D) is return address
J      (0030)→Prog_Addr                          C80030
SW     R5→Mem(R0+02)                             450502  0025
SW     R6→Mem(R0+03)                             450603  FF8A
SW     R7→Mem(R0+04)                             450704  0040
SW     R8→Mem(R0+05)                             450805  0007
SW     R9→Mem(R0+06)                             450906  0009
SW     R15→Mem(R0+07)                            450F07  001D
JR     R7→Prog_Addr                              487000
SW     R2→Mem(R0+08)                             450208  0004
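
The notes in the Instruction Set 3 table give the address arithmetic for branches and links: a taken branch at PC goes to PC+1+ext(offset), and JAL at 20 and JALR at 1A save 23 and 1D as their return addresses, that is, PC+3. The Python sketch below encodes exactly those two rules. The function names are hypothetical, and the PC+3 link rule is inferred from the two return addresses in the table rather than from a KDLX specification.

```python
MASK = 0xFFFF


def ext(imm8):
    """Sign-extend an 8-bit offset to 16 bits."""
    return (imm8 | 0xFF00) if imm8 & 0x80 else imm8


def branch_target(pc, offset8):
    """Target of a taken BNEZ/BEQZ, following the notes in the table."""
    return (pc + 1 + ext(offset8)) & MASK


def link_address(pc):
    """Return address saved by JAL/JALR, as implied by the table (PC+3)."""
    return (pc + 3) & MASK


bnez_target = branch_target(0x05, 0x04)   # BNEZ at 05 with offset 04
beqz_target = branch_target(0x0A, 0x04)   # BEQZ at 0A with offset 04
jal_link = link_address(0x20)             # JAL at 20
jalr_link = link_address(0x1A)            # JALR at 1A
```

These values match the instruction memory layout in the tables for this simulation, where the next real instructions sit at 0A, 0F, 23, and 1D.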
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h. 


Simulation Result of Instruction Set 3 


[Waveform figure. Signals: /testbench/clk_p, /testbench/clk_ram_rom, /testbench/en_rom,
/testbench/en_ram, /testbench/reset_p, /testbench/stall_p, /testbench/instr_pass,
/testbench/out_mem, /testbench/prog_rd_p, /testbench/read_p, /testbench/write_p,
/testbench/data_p]
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Tables of Registers and Memories in Simulation 3 


Instruction Mem (address, contents; "-" marks an address with no entry)

Addr Contents   Addr Contents   Addr Contents   Addr Contents   Addr Contents
00   -          10   000000     20   E80014     30   450502     40   450208
01   410103     11   000000     21   41068A     31   450603     41   000000
02   410204     12   -          22   410740     32   450704     42   000000
03   410300     13   -          23   -          33   450805     43   000000
04   410406     14   011820     24   -          34   450906
05   C01004     15   011940     25   C80030     35   450F07
06   000000     16   450F01     26   000000     36   487000
07   000000     17   000000     27   000000     37   000000
08   -          18   000000     28   -          38   000000
09   -          19   000000     29   -          39   -
0A   C13004     1A   685000     2A   -
0B   410525     1B   000000
0C   000000     1C   000000
0D   -          1D   -
0E   -          1E   -
0F   C80020     1F   -

Register

Reg  Contents   Reg  Contents
00   -          08   0007
01   0003       09   0009
02   0004       10   -
03   0000       11   -
04   0006       12   -
05   0025       13   -
06   FF8A       14   -
07   0040       15   -

Data Mem

Addr Contents
00   -
01   0023
02   0025
03   FF8A
04   0040
05   0007
06   0009
07   001D
08   0004
09 to 2A   -

j. Implementation Table of Instruction Set 4 



Instruction   Operation                                  Opcode   Expected Value
ADDI          R0+ext(04)->R1                             410104
ADDI          R0+ext(07)->R2                             410207
TRAP          (0020)->Prog_Addr; (06)->IAR               280020
              Note: (06) is return address
ADDI          R0+ext(09)->R3                             410309
ADDI          R0+ext(15)->R4                             410415
ADDI          R0+ext(0A)->R7                             41070A
ADDI          R0+ext(11)->R8                             410811
ADDI          R0+ext(C2)->R10                            410AC2
RFE           (06)->Prog_Addr                            F80000
              Note: (06) is IAR
J             (0011)->Prog_Addr                          C80011
SW            R1->Mem(R0+01)                             450101   0004
SW            R2->Mem(R0+02)                             450202   0007
SW            R3->Mem(R0+03)                             450303   0009
SW            R4->Mem(R0+04)                             450404   0015
SW            R7->Mem(R0+07)                             450707   000A
SW            R8->Mem(R0+08)                             450808   0011
SW            R10->Mem(R0+0A)                            450A0A   FFC2
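
The opcodes in this table can be read field by field. The sketch below is a minimal Python model, not part of the thesis design, that splits a 24-bit word into the opcode, Rs1, Rd, and immediate fields, assuming the I-type layout given in Appendix B (opcode in bits 23-16, Rs1 in bits 15-12, Rd in bits 11-8, immediate in bits 7-0). The names Fields and decode_itype are hypothetical.

```python
from typing import NamedTuple


class Fields(NamedTuple):
    opcode: int  # bits 23-16
    rs1: int     # bits 15-12
    rd: int      # bits 11-8
    immed: int   # bits 7-0 (raw value, not yet extended)


def decode_itype(word: int) -> Fields:
    """Split a 24-bit I-type instruction word into its four fields."""
    if not 0 <= word <= 0xFFFFFF:
        raise ValueError("instruction word must fit in 24 bits")
    return Fields(
        opcode=(word >> 16) & 0xFF,
        rs1=(word >> 12) & 0xF,
        rd=(word >> 8) & 0xF,
        immed=word & 0xFF,
    )


if __name__ == "__main__":
    # Three words taken from the table above.
    for word in (0x410104, 0x410AC2, 0x450A0A):
        f = decode_itype(word)
        print(f"{word:06X}: opcode=0x{f.opcode:02X} "
              f"Rs1=R{f.rs1} Rd=R{f.rd} Immed=0x{f.immed:02X}")
```

Read this way, 410AC2 is opcode 41 with R0 as the source, R10 as the destination, and C2 as the immediate, which agrees with the operation column.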
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Simulation Result of Instruction Set 4 
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1 . 


Tables of Registers and Memories in Simulation 4 
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D. TMR ASSEMBLY WITHOUT MEMORIES 
1. Schematic 

This is the design without the latch at the bottom. The three KDLX processors are on
the left and the six voters are in the center. The V_ERR, CID_1, CID_0, and ERR signals
are each collected onto one of four buses on the right. The read signal enables the
buffers for data coming from memory. The write signal enables the buffers for data
going to memory.
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2 . 


Test Bench 


The clock high and low times are each 50 ns. The input setup time and output
valid delay are each 10 ns. Because there are only two instructions, the test bench
is simple. It loads data into registers and stores it back to memory to check whether
this schematic works properly.
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I:RR[50:0| CD _ 
V_FRR(50:0)O “ 
addr p|l5:0| CD “ 
data p| 15:0) dJ 
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3. Simulation Result 

As described in Chapter V, this schematic without a latch does not write correct
data into the registers because of a timing problem. This error disappears when
memories are connected. Because this appendix displays only the final design of each
component, the imperfect simulation result is still included here. The TMR with a latch
is discussed in Chapter V, so it is not included here even though it works perfectly
without memories.
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E. TMR ASSEMBLY WITH MEMORIES 
1. Schematic 

This schematic uses the TMR Assembly without a latch. The instruction memory
on the left side sends the same instruction to all three processors at the same time.
Therefore, this schematic is used only for checking basic functions. Nothing related to
fault tolerance can be tested here.
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2 , 


Test Bench 


Since the instructions are pre-configured in ROM and the RAM has a default value of
0003 (hexadecimal), no data needs to be assigned. The test bench ends at 2900 ns. The
clock high and low times for both memories and processors are each 50 ns. The input
setup time and output valid delay are 10 ns for processors and 5 ns for memories.
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3. Simulation Result 
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F. FAULT-TOLERANT TESTING 
1. Schematic 

This simulation uses three ROMs so that different instructions can be inserted.
This simulates the condition in which the three processors receive inconsistent
instructions. The TMRA can also be modified to connect to three different RAMs, but
the simulation then becomes more complex and needs much more time for analysis. As
discussed in Chapter V, such errors should be caught and corrected by the voters as
long as no more than one SEU occurs in a voter.
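
The voting principle can be illustrated with a short software model. The sketch below is an assumption-laden illustration in Python rather than the thesis voter: it assumes a bitwise two-out-of-three majority over 24-bit words and an error flag raised whenever the three inputs are not identical. The names vote and vote_with_flag are hypothetical.

```python
MASK24 = 0xFFFFFF  # instruction words are 24 bits wide


def vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority: each output bit follows at least two inputs."""
    return ((a & b) | (a & c) | (b & c)) & MASK24


def vote_with_flag(a: int, b: int, c: int) -> tuple[int, bool]:
    """Return the voted word and a flag that is True when the inputs disagree."""
    return vote(a, b, c), not (a == b == c)


if __name__ == "__main__":
    # One processor receives 440203 while the other two receive 44020B.
    word, error = vote_with_flag(0x440203, 0x44020B, 0x44020B)
    print(f"voted={word:06X} error={error}")
```

With a single corrupted input, the two matching words always win the vote, which is why the scheme tolerates one upset at a time but not two simultaneous upsets in the same bit position.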
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2 . 


Test Bench 


The memories are pre-configured, so no special settings are needed in this test
bench. The simulation ends at 3400 ns. The clock high and low times for both memories
and processors are each 50 ns. The input setup time and output valid delay are 10 ns for
processors and 5 ns for memories.
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3. Memories Pre-configuration 


At each ROM address, at most one of the three ROMs holds a different instruction.
This avoids sending multiple errors to the voters at the same time. The RAM contains
non-repeated data in each address. Details on how to read the error detection signal
and analyze the error are discussed in Chapter V.


Addr   ROM A    ROM B    ROM C    RAM
00     000000   000000   000000   20
01     000000   000000   000000   21
02     000000   000000   000000   22
03     44010A   44010A   44010A   23
04     440203   44020B   44020B   24
05     44030C   440A0C   44030C   25
06     44040D   44040D   350911   26
07     000000   000000   000000   27
08     000000   000000   000000   28
09     000000   000000   000000   29
0A     000000   000000   000000   2A
0B     450106   450103   450103   2B
0C     450208   450207   450208   2C
0D     450309   450309   450302   2D
0E     450410   450410   450410   2E
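
The ROM contents above can be checked mechanically to confirm that no address carries more than one deviating instruction. The following Python sketch is only an illustration; the lists copy the three ROM columns, and the function name find_dissenters is hypothetical.

```python
# ROM contents for addresses 00 to 0E, copied from the pre-configuration table.
ROM_A = [0, 0, 0, 0x44010A, 0x440203, 0x44030C, 0x44040D,
         0, 0, 0, 0, 0x450106, 0x450208, 0x450309, 0x450410]
ROM_B = [0, 0, 0, 0x44010A, 0x44020B, 0x440A0C, 0x44040D,
         0, 0, 0, 0, 0x450103, 0x450207, 0x450309, 0x450410]
ROM_C = [0, 0, 0, 0x44010A, 0x44020B, 0x44030C, 0x350911,
         0, 0, 0, 0, 0x450103, 0x450208, 0x450302, 0x450410]


def find_dissenters() -> dict[int, str]:
    """Map each address where the ROMs disagree to the ROM that stands alone."""
    result = {}
    for addr, (a, b, c) in enumerate(zip(ROM_A, ROM_B, ROM_C)):
        if a == b == c:
            continue  # all three agree, nothing for the voters to correct
        if b == c:
            result[addr] = "A"
        elif a == c:
            result[addr] = "B"
        elif a == b:
            result[addr] = "C"
        else:
            result[addr] = "multiple"  # would defeat a 2-of-3 vote
    return result


if __name__ == "__main__":
    for addr, rom in find_dissenters().items():
        print(f"address {addr:02X}: ROM {rom} differs")
```

For this table the check reports addresses 04, 05, 06, 0B, 0C, and 0D, each with exactly one deviating ROM.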


4. Simulation Result


(Waveform figures. Signals shown: /testbench/clk_p, /testbench/clk_m, /testbench/en_rom, /testbench/en_ram, /testbench/addr_rom, /testbench/addr_ram, /testbench/instr_passa, /testbench/instr_passb, /testbench/instr_passc, /testbench/reset_p, /testbench/stall_p, /testbench/prog_p, /testbench/read_p, /testbench/write_p, /testbench/cid_1, /testbench/cid_0, /testbench/err, /testbench/v_err, /testbench/data_p, /testbench/in_mem, /testbench/out_mem.)


G. RECONCILER


1. Schematic 


rec 



2. Test Bench 


The clock high and low times are each 50 ns. The input setup time and output
valid delay are each 10 ns. The values on the data address, the program counter, and
the data inputs were set manually so that the one being fetched could be distinguished.
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3, Simulation Result 
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H. 


RECONCILER WITH KDLX AND MEMORY 
1. Schematic 
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2 . 


Test bench 


The clock high and low times for KDLX, Reconciler, and memory are 50 ns, 25 
ns, and 25 ns, respectively. The input setup times and output valid delays for KDLX, 
Reconciler, and memory are 8 ns, 9 ns, and 10 ns, respectively. 
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I. INTERRUPT 

1. Schematic 

The rfe_i(23:0) input is used to monitor the RFE instruction. The pc_in(15:0) input is
connected to the program counter of the KDLX. The sel_i(23:0) signal controls the muxes
in order to insert the TRAP and Jump instructions sent out from trap_i(23:0).


interrupt 



2. Test Bench 

Random numbers are assigned to rfe_i(23:0) and pc_in(15:0). An RFE instruction
at time 900 ns emulates the end of the ISR.
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3. Simulation Result 
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J. INTERRUPT WITH KDLX AND MEMORY 
1. Schematic 

The Reconciler is not included in this schematic, so two memories are used for a
Harvard architecture. In this design, the Interrupt only needs to monitor the instructions
from the ROM. The error signal is triggered manually in the test bench. Once the ISR
starts, the instruction on the bus is replaced with the TRAP instruction, which leads the
KDLX to execute the specific ISR. The last instruction in the ISR is the RFE instruction,
which activates the Interrupt to insert a new Jump instruction into the KDLX. The
circuit then goes back to its normal operation.
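
The instructions that the Interrupt places on the bus can be sketched in software. The Python model below is a rough illustration under stated assumptions: the TRAP word 280030, the two NOPs after it, the RFE check on the upper byte F8, and the Jump built as C8 followed by the latched program counter are taken from the Interrupt listing in Appendix C, which is the version revised for the TMRA, so the ISR address used in this schematic may differ. All function and constant names are hypothetical.

```python
TRAP_TO_ISR = 0x280030   # TRAP (opcode 0x28) to address 0x0030, per Appendix C
NOP = 0x000000
JUMP_OPCODE = 0xC8
RFE_OPCODE = 0xF8


def make_jump(latched_pc: int) -> int:
    """Build the return Jump: opcode C8 followed by the 16-bit latched PC."""
    return (JUMP_OPCODE << 16) | (latched_pc & 0xFFFF)


def is_rfe(word: int) -> bool:
    """The Interrupt only inspects the upper byte to recognise RFE."""
    return (word >> 16) & 0xFF == RFE_OPCODE


def inserted_on_error() -> list[int]:
    """Words forced onto the instruction bus once an error is seen."""
    return [TRAP_TO_ISR, NOP, NOP]


def inserted_on_return(rom_word: int, latched_pc: int):
    """Return the Jump to insert when the ROM word is RFE, otherwise None."""
    return make_jump(latched_pc) if is_rfe(rom_word) else None


if __name__ == "__main__":
    print([f"{w:06X}" for w in inserted_on_error()])
    print(f"{inserted_on_return(0xF80000, 0x0011):06X}")
```

The model leaves out all timing; in the hardware each inserted word is held for two Interrupt clock cycles, which is one KDLX clock cycle.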




2. Test Bench


The KDLX clock high and low times are each 50 ns. The input setup time and
output valid delay are each 10 ns. The Interrupt, ROM, and RAM all run at double speed
with clock high and low times of 25 ns each. The setup and hold times are each 3 ns.
An error is generated in the test bench at time 900 ns to check the function of the
state machine. This test bench stops at time 4900 ns.
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3. Memory Pre-configuration and Results 

The highlighted Opcode is where an error occurs in the test bench. The contents of
the Instruction Mem and the upper half of the Data Mem are pre-configured. The
Registers and the lower half of the Data Mem show the final values after the simulation
is done.
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Simulation Result 
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K. THE FULL DESIGN WITHOUT ESSD 
1. Schematic 

Three RAMs are used to provide inconsistent data to the TMRA. This schematic is
designed to simulate the circumstances when an error occurs. The real design
needs only one RAM and does not have to triplicate the instruction and data buses.
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2 . 


Test Bench 


The clock high and low times for KDLX, Reconciler, Interrupt, and memory are
50 ns, 25 ns, 25 ns, and 25 ns, respectively. The input setup times and output valid
delays for KDLX, Reconciler, Interrupt, and memory are 8 ns, 9 ns, 9 ns, and 10 ns,
respectively. The ending point of this test bench is at 4900 ns.

The signals between clk_i and clk_m are associated with the Interrupt clock cycle.
The signals between clk_m and clk_p are associated with the memory clock cycle. Each
signal in the simulation has to be associated with one clock.
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3. Memory Pre-configurations 
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Simulation Result 
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5. Zoom-in Figures of cid_l and cid_0 
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L. THE FULL DESIGN WITH ESSD 
1. Schematic 

The ESSD intercepts all connections to the RAMs while the error syndromes are being
stored. The clocks for the Interrupt and the Reconciler are wired together since they
work in parallel.
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2. Test Bench 

The clock high and low times for KDLX, latch51, Reconciler (or Interrupt),
ESSD, and memory are 50 ns, 25 ns, 25 ns, 25 ns, and 25 ns, respectively. The input
setup times and output valid delays for KDLX, latch51, Reconciler (or Interrupt), ESSD,
and memory are 8 ns, 8 ns, 9 ns, 9 ns, and 10 ns, respectively. The test bench ends at
time 4900 ns.
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APPENDIX B: KDLX INSTRUCTION SET DESCRIPTION 


This appendix lists the operation codes and functions of all instructions
used in the KDLX. This reference was originally contained in Dr. Kenneth Clark's
dissertation [8]. Some errors were found and have been checked with the author. The
corrected operation codes have been verified in the simulations of this thesis. The
operation descriptions are revised to give a clear description of how data is transferred.

Some symbols used in this appendix need to be introduced first. Rs1 represents
one of the 15 registers in KDLX. Rs2 also represents one of the 15 registers in KDLX.
Rs1 and Rs2 may be the same register. Rd represents one of the 15 registers in
KDLX used as a destination register. Immed7 represents the most significant bit of an
8-bit immediate value. [(Immed7)^8 || Immed] represents an 8-bit immediate value
sign-extended to 16 bits, where (x)^8 denotes the bit x repeated eight times.
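
The extension notation can be made concrete with a few lines of Python. This is a minimal sketch with hypothetical function names: sign_extend8 models [(Immed7)^8 || Immed], zero_extend8 models [(0)^8 || Immed], and lhi_value models Immed || (0)^8.

```python
def sign_extend8(immed: int) -> int:
    """[(Immed7)^8 || Immed]: copy bit 7 into the upper eight bits."""
    immed &= 0xFF
    return immed | 0xFF00 if immed & 0x80 else immed


def zero_extend8(immed: int) -> int:
    """[(0)^8 || Immed]: upper eight bits are zero."""
    return immed & 0xFF


def lhi_value(immed: int) -> int:
    """Immed || (0)^8: the immediate lands in the upper byte."""
    return (immed & 0xFF) << 8


if __name__ == "__main__":
    for value in (0x04, 0xC2, 0x8A):
        print(f"Immed={value:02X} sign={sign_extend8(value):04X} "
              f"zero={zero_extend8(value):04X} lhi={lhi_value(value):04X}")
```

The sign-extended results agree with the simulation tables in Appendix A, where an ADDI with immediate C2 produces FFC2 and an ADDI with immediate 8A produces FF8A.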


Instruction: ADD (Register Add)

Bits:    23-16           15-12   11-8    7-4     3-0
Fields:  Opcode: 0x01    Rs1     Rd      Rs2     Unused

Usage: ADD Rd, Rs1, Rs2
Operation: Rd <- (Rs1 + Rs2)


Instruction: ADDI (Add Immediate)

Bits:    23-16           15-12   11-8    7-0
Fields:  Opcode: 0x41    Rs1     Rd      Immed

Usage: ADDI Rd, Rs1, Immed
Operation: Rd <- (Rs1 + [(Immed7)^8 || Immed])


Instruction: ADDUI (Add Unsigned Immediate)

Bits:    23-16           15-12   11-8    7-0
Fields:  Opcode: 0x21    Rs1     Rd      Immed

Usage: ADDUI Rd, Rs1, Immed
Operation: Rd <- (Rs1 + [(0)^8 || Immed])
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Instruction: AND (Register AND) 


Bits:    23-16           15-12   11-8    7-4     3-0
Fields:  Opcode: 0x09    Rs1     Rd      Rs2     Unused

Usage: AND Rd, Rs1, Rs2
Operation: Rd <- (Rs1 (logical-and) Rs2)


Instruction: ANDI (AND Immediate)

Bits:    23-16           15-12   11-8    7-0
Fields:  Opcode: 0x29    Rs1     Rd      Immed

Usage: ANDI Rd, Rs1, Immed
Operation: Rd <- (Rs1 (logical-and) [(Immed7)^8 || Immed])


Instruction: BEQZ (Branch if Equal to Zero)

Bits:    23-16           15-12   11-8    7-0
Fields:  Opcode: 0xC1    Rs1     Unused  Immed

Usage: BEQZ Rs1, Immed
Operation: If Rs1 = 0, then Program_Address <- (PC + 1 + [(Immed7)^8 || Immed])


Instruction: BNEZ (Branch if Not Equal to Zero)

Bits:    23-16           15-12   11-8    7-0
Fields:  Opcode: 0xC0    Rs1     Unused  Immed

Usage: BNEZ Rs1, Immed
Operation: If Rs1 ≠ 0, then Program_Address <- (PC + 1 + [(Immed7)^8 || Immed])
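
The branch target arithmetic can be written out as a short Python sketch. It assumes 16-bit wrap-around of the program address, which the listing does not state, and the function names are hypothetical. Whether the branch is taken is returned separately so that the fall-through behaviour of the pipeline is not modelled.

```python
def sign_extend8(immed: int) -> int:
    """Sign-extend an 8-bit immediate to 16 bits."""
    immed &= 0xFF
    return immed | 0xFF00 if immed & 0x80 else immed


def branch_target(pc: int, immed: int) -> int:
    """PC + 1 + sign-extended immediate, wrapped to 16 bits (assumption)."""
    return (pc + 1 + sign_extend8(immed)) & 0xFFFF


def resolve(mnemonic: str, pc: int, rs1_value: int, immed: int):
    """Return the target address if the branch is taken, otherwise None."""
    if mnemonic == "BEQZ":
        taken = rs1_value == 0
    elif mnemonic == "BNEZ":
        taken = rs1_value != 0
    else:
        raise ValueError(f"not a branch: {mnemonic}")
    return branch_target(pc, immed) if taken else None


if __name__ == "__main__":
    print(f"{resolve('BNEZ', 0x05, 0x0003, 0x04):04X}")
    print(f"{resolve('BEQZ', 0x0A, 0x0000, 0x04):04X}")
```

The two calls mirror what appears to happen in Simulation 3 of Appendix A: a branch at address 05 with immediate 04 continues at 0A, and a branch at 0A with immediate 04 continues at 0F. A negative immediate such as FE moves the target backward.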


Instruction: J (Jump)

Bits:    23-16           15-0
Fields:  Opcode: 0xC8    Immed

Usage: J Immed
Operation: Program_Address <- Immed
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Instruction: JAL (Jump and Link) 


Bits:    23-16           15-0
Fields:  Opcode: 0xE8    Immed

Usage: JAL Immed
Operation: Program_Addr <- Immed;
           R15 <- Link_Program_Address


Instruction: JALR (Jump Register and Link)

Bits:    23-16           15-12   11-0
Fields:  Opcode: 0x68    Rs1     Unused

Usage: JALR Rs1
Operation: Program_Addr <- (Rs1);
           R15 <- Link_Program_Address


Instruction: JR (Jump Register)

Bits:    23-16           15-12   11-0
Fields:  Opcode: 0x48    Rs1     Unused

Usage: JR Rs1
Operation: Program_Address <- (Rs1)


Instruction: LHI (Load High Immediate)

Bits:    23-16           15-12   11-8    7-0
Fields:  Opcode: 0x08    Unused  Rd      Immed

Usage: LHI Rd, Immed
Operation: Rd <- Immed || (0)^8


Instruction: LW (Load Word)

Bits:    23-16           15-12   11-8    7-0
Fields:  Opcode: 0x44    Rs1     Rd      Immed

Usage: LW Rd, Rs1(Immed)
Operation: Rd <- Mem{Rs1 + [(Immed7)^8 || Immed]}


Instruction: NOP (No Operation)

Bits:    23-16           15-0
Fields:  Opcode: 0x00    Unused

Usage: NOP
Operation: None


Instruction: OR (Register OR)

Bits:    23-16           15-12   11-8    7-4     3-0
Fields:  Opcode: 0x2A    Rs1     Rd      Rs2     Unused

Usage: OR Rd, Rs1, Rs2
Operation: Rd <- (Rs1 (logical-or) Rs2)


Instruction: ORI (OR Immediate)

Bits:    23-16           15-12   11-8    7-0
Fields:  Opcode: 0x2A    Rs1     Rd      Immed

Usage: ORI Rd, Rs1, Immed
Operation: Rd <- (Rs1 (logical-or) Immed)


Instruction: RFE (Return from Exception)

Bits:    23-16           15-0
Fields:  Opcode: 0xF8    Unused

Usage: RFE
Operation: Program_Address <- Interrupt_Address_Register


Instruction: SEQ (Set if Equal)

Bits:    23-16           15-12   11-8    7-4     3-0
Fields:  Opcode: 0x18    Rs1     Rd      Rs2     Unused

Usage: SEQ Rd, Rs1, Rs2
Operation: If Rs1 = Rs2, then Rd = 0x0001 else Rd = 0x0000
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Instruction: SEQI (Set Equal Immediate) 


Bits:    23-16           15-12   11-8    7-0
Fields:  Opcode: 0x58    Rs1     Rd      Immed

Usage: SEQI Rd, Rs1, Immed
Operation: If Rs1 = [(Immed7)^8 || Immed], then Rd = 0x0001 else Rd = 0x0000


Instruction: SGE (Set if Greater Than or Equal)

Bits:    23-16           15-12   11-8    7-4     3-0
Fields:  Opcode: 0x19    Rs1     Rd      Rs2     Unused

Usage: SGE Rd, Rs1, Rs2
Operation: If Rs1 ≥ Rs2, then Rd = 0x0001 else Rd = 0x0000


Instruction: SGEI (Set if Greater Than or Equal Immediate)

Bits:    23-16           15-12   11-8    7-0
Fields:  Opcode: 0x59    Rs1     Rd      Immed

Usage: SGEI Rd, Rs1, Immed
Operation: If Rs1 ≥ [(Immed7)^8 || Immed], then Rd = 0x0001 else Rd = 0x0000


Instruction: SGT (Set if Greater Than)

Bits:    23-16           15-12   11-8    7-4     3-0
Fields:  Opcode: 0x1A    Rs1     Rd      Rs2     Unused

Usage: SGT Rd, Rs1, Rs2
Operation: If Rs1 > Rs2, then Rd = 0x0001 else Rd = 0x0000


Instruction: SGTI (Set if Greater Than Immediate)

Bits:    23-16           15-12   11-8    7-0
Fields:  Opcode: 0x5A    Rs1     Rd      Immed

Usage: SGTI Rd, Rs1, Immed
Operation: If Rs1 > [(Immed7)^8 || Immed], then Rd = 0x0001 else Rd = 0x0000


Instruction: SLE (Set if Less Than or Equal)

Bits:    23-16           15-12   11-8    7-4     3-0
Fields:  Opcode: 0x1B    Rs1     Rd      Rs2     Unused

Usage: SLE Rd, Rs1, Rs2
Operation: If Rs1 ≤ Rs2, then Rd = 0x0001 else Rd = 0x0000


Instruction: SLEI (Set if Less Than or Equal Immediate)

Bits:    23-16           15-12   11-8    7-0
Fields:  Opcode: 0x5B    Rs1     Rd      Immed

Usage: SLEI Rd, Rs1, Immed
Operation: If Rs1 ≤ [(Immed7)^8 || Immed], then Rd = 0x0001 else Rd = 0x0000


Instruction: SLL (Shift Logic Left)

Bits:    23-16           15-12   11-8    7-4     3-0
Fields:  Opcode: 0x11    Rs1     Rd      Rs2     Unused

Usage: SLL Rd, Rs1, Rs2
Operation: Rd <- (Rs1) shifted left by Rs2(3:0) bits


Instruction: SLLI (Shift Logic Left Immediate)

Bits:    23-16           15-12   11-8    7-0
Fields:  Opcode: 0x51    Rs1     Rd      Immed

Usage: SLLI Rd, Rs1, Immed
Operation: Rd <- (Rs1) shifted left by Immed(3:0) bits


Instruction: SLT (Set if Less Than)

Bits:    23-16           15-12   11-8    7-4     3-0
Fields:  Opcode: 0x1C    Rs1     Rd      Rs2     Unused

Usage: SLT Rd, Rs1, Rs2
Operation: If Rs1 < Rs2, then Rd = 0x0001 else Rd = 0x0000
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Instruction: SLTI (Set if Less Than Immediate) 


Bits:    23-16           15-12   11-8    7-0
Fields:  Opcode: 0x5C    Rs1     Rd      Immed

Usage: SLTI Rd, Rs1, Immed
Operation: If Rs1 < [(Immed7)^8 || Immed], then Rd = 0x0001 else Rd = 0x0000


Instruction: SNE (Set if Not Equal)

Bits:    23-16           15-12   11-8    7-4     3-0
Fields:  Opcode: 0x1D    Rs1     Rd      Rs2     Unused

Usage: SNE Rd, Rs1, Rs2
Operation: If Rs1 ≠ Rs2, then Rd = 0x0001 else Rd = 0x0000


Instruction: SNEI (Set if Not Equal Immediate)

Bits:    23-16           15-12   11-8    7-0
Fields:  Opcode: 0x58    Rs1     Rd      Immed

Usage: SNEI Rd, Rs1, Immed
Operation: If Rs1 ≠ [(Immed7)^8 || Immed], then Rd = 0x0001 else Rd = 0x0000


Instruction: SRA (Shift Right Arithmetic)

Bits:    23-16           15-12   11-8    7-4     3-0
Fields:  Opcode: 0x13    Rs1     Rd      Rs2     Unused

Usage: SRA Rd, Rs1, Rs2
Operation: Rd <- (Rs1) shifted right by Rs2(3:0) bits, with Rs1(15) shifted in from
the left (for sign extension)


Instruction: SRAI (Shift Right Arithmetic Immediate)

Bits:    23-16           15-12   11-8    7-0
Fields:  Opcode: 0x53    Rs1     Rd      Immed

Usage: SRAI Rd, Rs1, Immed
Operation: Rd <- (Rs1) shifted right by Immed(3:0) bits, with Rs1(15) shifted in from
the left (for sign extension)


Instruction: SRL (Shift Right Logical)

Bits:    23-16           15-12   11-8    7-4     3-0
Fields:  Opcode: 0x12    Rs1     Rd      Rs2     Unused

Usage: SRL Rd, Rs1, Rs2
Operation: Rd <- (Rs1) shifted right by Rs2(3:0) bits, with 0's shifted in from the left


Instruction: SRLI (Shift Right Logical Immediate)

Bits:    23-16           15-12   11-8    7-0
Fields:  Opcode: 0x52    Rs1     Rd      Immed

Usage: SRLI Rd, Rs1, Immed
Operation: Rd <- (Rs1) shifted right by Immed(3:0) bits, with 0's shifted in from the left
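
The difference between the logical and arithmetic right shifts is easiest to see on a negative value. The Python sketch below models the three shift families on 16-bit register values, using only the low four bits of the shift amount as in the operation descriptions; the function names are hypothetical and the code is an illustration, not the KDLX ALU.

```python
def sll(value: int, amount: int) -> int:
    """Shift left; bits moved past bit 15 are discarded."""
    return (value << (amount & 0xF)) & 0xFFFF


def srl(value: int, amount: int) -> int:
    """Logical shift right; zeros fill the vacated upper bits."""
    return (value & 0xFFFF) >> (amount & 0xF)


def sra(value: int, amount: int) -> int:
    """Arithmetic shift right; copies of bit 15 fill the vacated upper bits."""
    value &= 0xFFFF
    if value & 0x8000:
        value -= 0x10000  # reinterpret as a negative two's-complement number
    return (value >> (amount & 0xF)) & 0xFFFF


if __name__ == "__main__":
    for name, fn in (("SLL", sll), ("SRL", srl), ("SRA", sra)):
        print(f"{name}: {fn(0xFF8A, 4):04X}")
```

For the value FF8A shifted by four bits, the logical shift gives 0FF8 while the arithmetic shift gives FFF8, preserving the sign.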


Instruction: SUB (Register Subtract)

Bits:    23-16           15-12   11-8    7-4     3-0
Fields:  Opcode: 0x03    Rs1     Rd      Rs2     Unused

Usage: SUB Rd, Rs1, Rs2
Operation: Rd <- (Rs1 - Rs2)


Instruction: SUBI (Subtract Immediate)

Bits:    23-16           15-12   11-8    7-0
Fields:  Opcode: 0x43    Rs1     Rd      Immed

Usage: SUBI Rd, Rs1, Immed
Operation: Rd <- (Rs1 - [(Immed7)^8 || Immed])


Instruction: SUBUI (Subtract Unsigned Immediate)

Bits:    23-16           15-12   11-8    7-0
Fields:  Opcode: 0x23    Rs1     Rd      Immed

Usage: SUBUI Rd, Rs1, Immed
Operation: Rd <- (Rs1 - [(0)^8 || Immed])
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Instruction: SW (Store Word) 


Bits:    23-16           15-12   11-8    7-0
Fields:  Opcode: 0x45    Rs1     Rs2     Immed

Usage: SW Rs2, Rs1(Immed)
Operation: Mem{Rs1 + [(Immed7)^8 || Immed]} <- Rs2


Instruction: TRAP (Software Trap)

Bits:    23-16           15-0
Fields:  Opcode: 0x28    Immed

Usage: TRAP Immed
Operation: Program_Address <- Immed;
           Interrupt_Address_Register <- Link_Program_Address


Instruction: XOR (Register Exclusive-OR)

Bits:    23-16           15-12   11-8    7-4     3-0
Fields:  Opcode: 0x0B    Rs1     Rd      Rs2     Unused

Usage: XOR Rd, Rs1, Rs2
Operation: Rd <- (Rs1 (exclusive-or) Rs2)


Instruction: XORI (Exclusive-OR Immediate)

Bits:    23-16           15-12   11-8    7-0
Fields:  Opcode: 0x2B    Rs1     Rd      Immed

Usage: XORI Rd, Rs1, Immed
Operation: Rd <- (Rs1 (exclusive-or) Immed)
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APPENDIX C: VHDL CODE 


A. RECONCILER

-- Module: Reconciler
-- Function: The Reconciler is used as an interface between the KDLX
-- and memory. It runs two times faster than the KDLX.
-- Author: Rong Yuan, TWAF
-- Date: Nov 14, 2003

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity rec is Port (
    clk_r:      in    std_logic;
    reset_r:    in    std_logic;
    rd_r:       in    std_logic;
    wr_r:       in    std_logic;
    addrin_r:   in    std_logic_vector(15 downto 0);
    pc_r:       in    std_logic_vector(15 downto 0);
    datain_r:   in    std_logic_vector(23 downto 0);
    addrout_r:  out   std_logic_vector(15 downto 0);
    instr_data: out   std_logic_vector(23 downto 0);
    dataout_r:  out   std_logic_vector(23 downto 0);
    mem_data:   inout std_logic_vector(15 downto 0);
    wrout_r:    out   std_logic;
    state_r:    out   std_logic_vector(3 downto 0)
);
end rec;

architecture fsm of rec is -- fsm is Finite State Machine
    type targetFSM is (State, State0, State1, ReadState, WriteState);
    signal currState, nextState: targetFSM;
begin

    -- Process to compute the next state
    nxtStProc: process (currState, rd_r, wr_r)
    begin
        case currState is
            when State =>
                nextState <= State0;
            when State0 =>
                if (rd_r='0' and wr_r='1') then     -- read from memory
                    nextState <= ReadState;
                elsif (rd_r='1' and wr_r='0') then  -- write to memory
                    nextState <= WriteState;
                else
                    nextState <= State1;
                end if;
            when State1 =>
                nextState <= State0;
            when ReadState =>
                nextState <= State0;
            when WriteState =>
                nextState <= State0;
        end case;
    end process nxtStProc;

    -- Process to register the current state
    curStProc: process (clk_r, reset_r)
    begin
        if (reset_r = '0') then
            currState <= State;
        elsif (clk_r'event and clk_r='1') then
            currState <= nextState;
        end if;
    end process curStProc;

    -- Process to generate outputs
    outConProc: process (currState, wr_r, pc_r, datain_r, addrin_r,
                         mem_data)
    begin
        case currState is
            when State =>       -- generated for reset only
                null;           -- without this state, state machine
                                -- starts at State1 after reset
            when State0 =>      -- doing instruction fetch
                state_r <= "0000";
                wrout_r <= wr_r;
                addrout_r <= pc_r;       -- sending pc to memory
                instr_data <= datain_r;  -- memory sends instruction to KDLX
                dataout_r <= (others => 'Z');
                mem_data <= (others => 'Z');
            when State1 =>      -- exactly the same as State0
                                -- for keeping current state
                state_r <= "0001";
                wrout_r <= wr_r;
                addrout_r <= pc_r;
                instr_data <= datain_r;
                dataout_r <= (others => 'Z');
                mem_data <= (others => 'Z');
            when ReadState =>   -- When KDLX reads data from memory
                state_r <= "0010";
                wrout_r <= wr_r;         -- write signal is one
                addrout_r <= addrin_r;   -- sending address to memory
                mem_data <= datain_r(15 downto 0);
                                         -- memory sends data to KDLX
                dataout_r <= (others => 'Z');  -- block input to memory
            when WriteState =>  -- When KDLX writes data to memory
                state_r <= "0011";
                wrout_r <= wr_r;         -- write signal is zero
                addrout_r <= addrin_r;   -- sending address to memory
                dataout_r(15 downto 0) <= mem_data;
                                         -- KDLX sends data to memory
                dataout_r(23 downto 16) <= "00000000";
                                         -- sign extension data
        end case;
    end process outConProc;
end fsm;
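
The next-state logic of the Reconciler can be traced in software to see how it alternates between fetch and data access. The Python sketch below mirrors only the nxtStProc case statement of the listing above, with the rd_r and wr_r levels as written there; the enum member names, next_state, and trace are hypothetical, and no outputs or timing are modelled.

```python
from enum import Enum


class State(Enum):
    STATE = "State"        # reset-only state
    STATE0 = "State0"      # instruction fetch
    STATE1 = "State1"      # hold, same outputs as State0
    READ = "ReadState"
    WRITE = "WriteState"


def next_state(current: State, rd_r: int, wr_r: int) -> State:
    """Mirror of the nxtStProc case statement."""
    if current is State.STATE0:
        if rd_r == 0 and wr_r == 1:
            return State.READ
        if rd_r == 1 and wr_r == 0:
            return State.WRITE
        return State.STATE1
    # State, State1, ReadState and WriteState all return to State0.
    return State.STATE0


def trace(inputs, start: State = State.STATE) -> list[State]:
    """Apply one (rd_r, wr_r) pair per Reconciler clock edge."""
    states, current = [], start
    for rd_r, wr_r in inputs:
        current = next_state(current, rd_r, wr_r)
        states.append(current)
    return states


if __name__ == "__main__":
    for s in trace([(1, 1), (1, 1), (0, 1), (0, 1), (1, 1)]):
        print(s.value)
```

Every second Reconciler clock returns to State0, which fits the description of the Reconciler running twice as fast as the KDLX: one half of each KDLX cycle is an instruction fetch and the other half is either a hold or a data access.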
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B. 


INTERRUPT 


-- Module: Interrupt
-- Function: The Interrupt is used to switch to ISR when err occurs.
-- It runs in double speed and has the same time constraints with
-- Reconciler. TRAP to other instruction set and jump back when done.
-- Notation: This Interrupt is revised to work with TMRA in this design
-- only. This is the final version before ESSD is generated. Only two
-- NOPs after TRAP.
-- Author: Rong Yuan, TWAF
-- Date: Nov 17, 2003

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity Interrupt is Port (
    rfe_i:   in  std_logic_vector(23 downto 0);
    pc_in:   in  std_logic_vector(15 downto 0);
    err:     in  std_logic;
    reset_i: in  std_logic;
    clk_i:   in  std_logic;
    pc_out:  out std_logic_vector(15 downto 0);
    sel_i:   out std_logic_vector(23 downto 0);
    trap_i:  out std_logic_vector(23 downto 0);
    state_i: out std_logic_vector(3 downto 0)
);
end Interrupt;

architecture fsm of Interrupt is
    type targetFSM is (State, State0_A, State0_B, TrapState_A, TrapState_B,
                       NopState0_A, NopState0_B, NopState1_A, NopState1_B,
                       WaitState_A, WaitState_B, BackState_A, BackState_B);
    signal pc_latch: std_logic_vector(15 downto 0);
    signal new_instr: std_logic_vector(23 downto 0);
    signal currState, nextState: targetFSM;
begin

    -- Process to compute the next state
    nxtStProc: process (currState, err, rfe_i)
    begin
        case currState is
            when State =>
                nextState <= State0_A;
            when State0_A =>
                nextState <= State0_B;
            when State0_B =>
                if (err='1') then
                    nextState <= TrapState_A;
                else
                    nextState <= State0_A;
                end if;
            when TrapState_A =>
                nextState <= TrapState_B;
            when TrapState_B =>
                nextState <= NopState0_A;
            when NopState0_A =>
                nextState <= NopState0_B;
            when NopState0_B =>
                nextState <= NopState1_A;
            when NopState1_A =>
                nextState <= NopState1_B;
            when NopState1_B =>
                nextState <= WaitState_A;
            when WaitState_A =>
                nextState <= WaitState_B;
            when WaitState_B =>
                if (rfe_i(23 downto 16)="11111000") then  -- check F80000
                    nextState <= BackState_A;
                else
                    nextState <= WaitState_A;  -- stay if not seeing F80000
                end if;
            when BackState_A =>
                nextState <= BackState_B;
            when BackState_B =>
                nextState <= State0_A;
        end case;
    end process nxtStProc;

    -- Process to register the current state
    curStProc: process (clk_i, reset_i)
    begin
        if (reset_i = '0') then
            currState <= State;
        elsif (clk_i'event and clk_i='1') then
            currState <= nextState;
        end if;
    end process curStProc;

    -- Process to generate outputs
    outConProc: process (currState, pc_in)
    begin
        case currState is
            when State =>
                null;
            when State0_A =>
                state_i <= "0000";
                trap_i <= (others => 'Z');
                sel_i <= "111111111111111111111111";
                pc_out <= (others => 'Z');
            when State0_B =>
                state_i <= "0001";
                trap_i <= (others => 'Z');
                sel_i <= "111111111111111111111111";
                pc_out <= (others => 'Z');
            when TrapState_A =>
                state_i <= "0010";
                sel_i <= "000000000000000000000000";   --allow TRAP pass to KDLX
                trap_i <= "001010000000000000110000";  --TRAP instr 280030
                pc_latch <= pc_in;           --latch pc for new instruction
            when TrapState_B =>
                state_i <= "0011";
                sel_i <= "000000000000000000000000";
                pc_out <= pc_latch;          --show latched pc on bus
            when NopState0_A =>
                state_i <= "0100";
                trap_i <= "000000000000000000000000";  --allow NOP to KDLX
                sel_i <= "000000000000000000000000";
                pc_out <= (others => 'Z');
            when NopState0_B =>
                state_i <= "0101";
                sel_i <= "000000000000000000000000";   --allow NOP to KDLX
                pc_out <= (others => 'Z');
            when NopState1_A =>
                state_i <= "0110";
                trap_i <= "000000000000000000000000";
                sel_i <= "000000000000000000000000";
                pc_out <= (others => 'Z');
                --construct new JUMP instr
                new_instr(23 downto 16) <= "11001000";
                new_instr(15 downto 0) <= pc_latch;    --JUMP is C8+pc
            when NopState1_B =>
                state_i <= "0111";
                sel_i <= "000000000000000000000000";
                pc_out <= (others => 'Z');
            when WaitState_A =>
                state_i <= "1000";
                trap_i <= (others => 'Z');
                sel_i <= "111111111111111111111111";
                pc_out <= (others => 'Z');
            when WaitState_B =>
                state_i <= "1001";
                trap_i <= (others => 'Z');
                sel_i <= "111111111111111111111111";
                pc_out <= (others => 'Z');
            when BackState_A =>
                state_i <= "1010";
                trap_i <= new_instr;         --allow new JUMP to KDLX
                sel_i <= "000000000000000000000000";
                pc_out <= (others => 'Z');
            when BackState_B =>
                state_i <= "1011";
                sel_i <= "000000000000000000000000";
                pc_out <= (others => 'Z');
        end case;
    end process outConProc;
end fsm;
C. RECONCILER FOR THE FULL DESIGN

-- Module: Reconciler
-- Function: The Reconciler is the interface between the TMRA and
-- memory. It runs at double speed, acting as instruction memory in
-- the first half of the KDLX clock and as data memory in the second
-- half of the KDLX clock.
-- Note: This Reconciler is revised to work with the TMRA in this
-- design only. Data buses are triplicated.
-- Author: Rong Yuan, TWAF
-- Date: Nov 14, 2003


library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity rec2 is Port (
  clk_r: in std_logic;
  reset_r: in std_logic;
  rd_r: in std_logic;
  wr_r: in std_logic;
  addrin_r: in std_logic_vector(15 downto 0);
  pc_r: in std_logic_vector(15 downto 0);
  datain_a: in std_logic_vector(23 downto 0);
  datain_b: in std_logic_vector(23 downto 0);
  datain_c: in std_logic_vector(23 downto 0);
  addrout_r: out std_logic_vector(15 downto 0);
  instr_data_a: out std_logic_vector(23 downto 0);
  instr_data_b: out std_logic_vector(23 downto 0);
  instr_data_c: out std_logic_vector(23 downto 0);
  dataout_r: out std_logic_vector(23 downto 0);
  mem_data_a: out std_logic_vector(15 downto 0); -- data from mem to KDLX
  mem_data_b: out std_logic_vector(15 downto 0);
  mem_data_c: out std_logic_vector(15 downto 0);
  mem_data_wr: in std_logic_vector(15 downto 0); -- data from KDLX to mem
  wrout_r: out std_logic;
  state_r: out std_logic_vector(3 downto 0)
);
end rec2;

architecture fsm of rec2 is -- fsm is Finite State Machine
  type targetFSM is (State, State0, State1, ReadState, WriteState);
  signal currState, nextState: targetFSM;
begin

-- Process to determine the next state
nxtStProc: process (currState, rd_r, wr_r)
begin
  case currState is
    when State =>
      nextState <= State0;
    when State1 =>
      nextState <= State0;
    when ReadState =>
      nextState <= State0;
    when WriteState =>
      nextState <= State0;
    when State0 =>
      if (rd_r='0' and wr_r='1') then
        nextState <= ReadState; -- read from memory
      elsif (rd_r='1' and wr_r='0') then
        nextState <= WriteState; -- write to memory
      else
        nextState <= State1;
      end if;
  end case;
end process nxtStProc;

-- Process to register the current state
curStProc: process (clk_r, reset_r)
begin
  if (reset_r ='0') then
    currState <= State;
  elsif (clk_r'event and clk_r='1') then
    currState <= nextState;
  end if;
end process curStProc;

-- Process to generate outputs
outConProc: process (currState, wr_r, pc_r, datain_a, datain_b,
                     datain_c, addrin_r, mem_data_wr)
begin
  case currState is
    -- generated for reset only; without this state, the state
    -- machine starts at State1 after reset
    when State =>
      null;

    when State0 => -- doing instruction fetch
      state_r <= "0000";
      wrout_r <= wr_r;
      addrout_r <= pc_r; -- sending pc to memory
      if (datain_a(23 downto 16)="11111000") then
        instr_data_a <= "000000000000000000000000";
        instr_data_b <= "000000000000000000000000";
        instr_data_c <= "000000000000000000000000";
      else
        instr_data_a <= datain_a; -- memory sends instruction to KDLX
        instr_data_b <= datain_b;
        instr_data_c <= datain_c;
      end if;
      dataout_r <= (others => 'Z');
      mem_data_a <= (others => 'Z');
      mem_data_b <= (others => 'Z');
      mem_data_c <= (others => 'Z');

    -- exactly the same as State0, for keeping current state
    when State1 =>
      state_r <= "0001";
      wrout_r <= wr_r;
      addrout_r <= pc_r;
      if (datain_a(23 downto 16)="11111000") then
        instr_data_a <= "000000000000000000000000";
        instr_data_b <= "000000000000000000000000";
        instr_data_c <= "000000000000000000000000";
      else -- memory sends instruction to KDLX
        instr_data_a <= datain_a;
        instr_data_b <= datain_b;
        instr_data_c <= datain_c;
      end if;
      dataout_r <= (others => 'Z');
      mem_data_a <= (others => 'Z');
      mem_data_b <= (others => 'Z');
      mem_data_c <= (others => 'Z');

    -- When KDLX reads data from memory
    when ReadState =>
      state_r <= "0010";
      wrout_r <= wr_r; -- write signal is one
      addrout_r <= addrin_r; -- sending address to memory
      -- memory sends data to KDLX
      mem_data_a <= datain_a(15 downto 0);
      mem_data_b <= datain_b(15 downto 0);
      mem_data_c <= datain_c(15 downto 0);
      dataout_r <= (others => 'Z'); -- block input to memory

    -- When KDLX writes data to memory
    when WriteState =>
      state_r <= "0011";
      wrout_r <= wr_r; -- write signal is zero
      addrout_r <= addrin_r; -- sending address to memory
      -- KDLX sends data to memory
      dataout_r(15 downto 0) <= mem_data_wr;
      dataout_r(23 downto 16) <= "00000000"; -- pad upper byte with zeros
  end case;
end process outConProc;
end fsm;
D. ESSD

-- Module: Error Syndrome Storage Device (ESSD)
-- Function: The ESSD stores the error syndrome when an error occurs.
-- It runs at double speed and has the same timing constraints as the
-- Reconciler. It stalls the KDLX at the beginning of the ISR.
-- Note: This ESSD works with the TMRA in this design only. This
-- is the final version.
-- Author: Rong Yuan, TWAF
-- Date: Nov 21, 2003


library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity essd is Port (
  addr_in: in std_logic_vector(15 downto 0);
  pc_in: in std_logic_vector(15 downto 0);
  cid1_in: in std_logic_vector(50 downto 0);
  cid0_in: in std_logic_vector(50 downto 0);
  err: in std_logic;
  reset_s: in std_logic;
  clk_s: in std_logic;
  stall_s: out std_logic;
  wr_s: out std_logic;
  sel_wr: out std_logic;
  addr_s: out std_logic_vector(15 downto 0);
  sel_addr: out std_logic_vector(15 downto 0);
  sel_s: out std_logic_vector(23 downto 0);
  ess: out std_logic_vector(23 downto 0);
  state_s: out std_logic_vector(4 downto 0)
);
end essd;

architecture fsm of essd is
  type targetFSM is (State, State0_A, State0_B, LatchState_A,
                     LatchState_B, NopState0_A, NopState0_B, NopState1_A,
                     NopState1_B, StallState, StoreState0_A,
                     StoreState0_B, StoreState0_C, StoreState1_A,
                     StoreState1_B, StoreState1_C, StoreState_addr,
                     StoreState_pc, BackState);
  signal pc_latch, addr_latch: std_logic_vector(15 downto 0);
  signal cid0_latchA, cid0_latchB, cid0_latchC, cid1_latchA, cid1_latchB,
         cid1_latchC: std_logic_vector(23 downto 0);
  signal counter: std_logic_vector(15 downto 0);
  signal currState, nextState: targetFSM;
begin

-- Process to determine the next state
nxtStProc: process (currState, err)
begin
  case currState is
    when State =>
      nextState <= State0_A;
    when State0_A =>
      nextState <= State0_B;
    when State0_B =>
      if (err='1') then
        nextState <= LatchState_A;
      else
        nextState <= State0_A;
      end if;
    when LatchState_A =>
      nextState <= LatchState_B;
    when LatchState_B =>
      nextState <= NopState0_A;
    when NopState0_A =>
      nextState <= NopState0_B;
    when NopState0_B =>
      nextState <= NopState1_A;
    when NopState1_A =>
      nextState <= NopState1_B;
    when NopState1_B =>
      nextState <= StallState;
    when StallState =>
      nextState <= StoreState0_A;
    when StoreState0_A =>
      nextState <= StoreState0_B;
    when StoreState0_B =>
      nextState <= StoreState0_C;
    when StoreState0_C =>
      nextState <= StoreState1_A;
    when StoreState1_A =>
      nextState <= StoreState1_B;
    when StoreState1_B =>
      nextState <= StoreState1_C;
    when StoreState1_C =>
      nextState <= StoreState_addr;
    when StoreState_addr =>
      nextState <= StoreState_pc;
    when StoreState_pc =>
      nextState <= BackState;
    when BackState =>
      nextState <= State0_A;
  end case;
end process nxtStProc;

-- Process to register the current state
curStProc: process (clk_s, reset_s)
begin
  if (reset_s ='0') then
    currState <= State;
  elsif (clk_s'event and clk_s='1') then
    currState <= nextState;
  end if;
end process curStProc;

-- Process to generate outputs
outConProc: process (currState, pc_in, addr_in, cid1_in, cid0_in)
begin
  counter <= "0000000001011001"; --starting at address 0059

  case currState is
    when State =>
      null;

    when State0_A =>
      state_s <= "00000";
      ess <= (others => 'Z');
      sel_s <= "111111111111111111111111";
      sel_wr <= '1';
      sel_addr <= "1111111111111111";
      stall_s <= '1';

    when State0_B =>
      state_s <= "00001";
      ess <= (others => 'Z');
      sel_s <= "111111111111111111111111";
      sel_wr <= '1';
      sel_addr <= "1111111111111111";
      stall_s <= '1';

    when LatchState_A => --latch all data here
      state_s <= "00010";
      sel_s <= "111111111111111111111111";
      sel_wr <= '1';
      sel_addr <= "1111111111111111";
      stall_s <= '1';
      pc_latch <= pc_in;
      addr_latch <= addr_in;
      --separate input data
      cid1_latchC <= cid1_in(23 downto 0);
      cid1_latchB <= cid1_in(47 downto 24);
      cid1_latchA(2 downto 0) <= cid1_in(50 downto 48);
      cid1_latchA(23 downto 3) <= "000000000000000000000";
      cid0_latchC <= cid0_in(23 downto 0);
      cid0_latchB <= cid0_in(47 downto 24);
      cid0_latchA(2 downto 0) <= cid0_in(50 downto 48);
      cid0_latchA(23 downto 3) <= "000000000000000000000";

    when LatchState_B =>
      state_s <= "00011";
      sel_s <= "111111111111111111111111";
      sel_wr <= '1';
      sel_addr <= "1111111111111111";
      stall_s <= '1';

    when NopState0_A =>
      state_s <= "00100";
      sel_s <= "111111111111111111111111";
      sel_wr <= '1';
      sel_addr <= "1111111111111111";
      stall_s <= '1';

    when NopState0_B =>
      state_s <= "00101";
      sel_s <= "111111111111111111111111";
      sel_wr <= '1';
      sel_addr <= "1111111111111111";
      stall_s <= '1';

    when NopState1_A =>
      state_s <= "00110";
      sel_s <= "111111111111111111111111";
      sel_wr <= '1';
      sel_addr <= "1111111111111111";
      stall_s <= '1';

    when NopState1_B =>
      state_s <= "00111";
      sel_s <= "111111111111111111111111";
      sel_wr <= '1';
      sel_addr <= "1111111111111111";
      stall_s <= '1';

    when StallState => --stall KDLX
      state_s <= "01000";
      sel_s <= "111111111111111111111111";
      sel_wr <= '1';
      sel_addr <= "1111111111111111";
      stall_s <= '0';

    when StoreState0_A => --store cid0
      state_s <= "01001";
      sel_s <= "000000000000000000000000";
      sel_wr <= '0';
      sel_addr <= "0000000000000000";
      stall_s <= '0';
      addr_s <= counter;
      wr_s <= '0';
      ess <= cid0_latchC;
      counter <= counter-1;

    when StoreState0_B =>
      state_s <= "01010";
      sel_s <= "000000000000000000000000";
      sel_wr <= '0';
      sel_addr <= "0000000000000000";
      stall_s <= '0';
      addr_s <= counter;
      wr_s <= '0';
      ess <= cid0_latchB;
      counter <= counter-1;

    when StoreState0_C =>
      state_s <= "01011";
      sel_s <= "000000000000000000000000";
      sel_wr <= '0';
      sel_addr <= "0000000000000000";
      stall_s <= '0';
      addr_s <= counter;
      wr_s <= '0';
      ess <= cid0_latchA;
      counter <= counter-1;

    when StoreState1_A => --store cid1
      state_s <= "01100";
      sel_s <= "000000000000000000000000";
      sel_wr <= '0';
      sel_addr <= "0000000000000000";
      stall_s <= '0';
      addr_s <= counter;
      wr_s <= '0';
      ess <= cid1_latchC;
      counter <= counter-1;

    when StoreState1_B =>
      state_s <= "01101";
      sel_s <= "000000000000000000000000";
      sel_wr <= '0';
      sel_addr <= "0000000000000000";
      stall_s <= '0';
      addr_s <= counter;
      wr_s <= '0';
      ess <= cid1_latchB;
      counter <= counter-1;

    when StoreState1_C =>
      state_s <= "01110";
      sel_s <= "000000000000000000000000";
      sel_wr <= '0';
      sel_addr <= "0000000000000000";
      stall_s <= '0';
      addr_s <= counter;
      wr_s <= '0';
      ess <= cid1_latchA;
      counter <= counter-1;

    when StoreState_addr => --store mem addr
      state_s <= "01111";
      sel_s <= "000000000000000000000000";
      sel_wr <= '0';
      sel_addr <= "0000000000000000";
      stall_s <= '0';
      addr_s <= counter;
      wr_s <= '0';
      ess(15 downto 0) <= addr_latch;
      ess(23 downto 16) <= "00000000";
      counter <= counter-1;

    when StoreState_pc => --store pc
      state_s <= "10000";
      sel_s <= "000000000000000000000000";
      sel_wr <= '0';
      sel_addr <= "0000000000000000";
      stall_s <= '0';
      addr_s <= counter;
      wr_s <= '0';
      ess(15 downto 0) <= pc_latch;
      ess(23 downto 16) <= "00000000";
      counter <= counter-1;

    when BackState => --release KDLX
      state_s <= "10001";
      sel_s <= "111111111111111111111111";
      sel_wr <= '1';
      sel_addr <= "1111111111111111";
      stall_s <= '1';
      addr_s <= (others => 'Z');
      wr_s <= '1';
      ess <= (others => 'Z');
  end case;
end process outConProc;
end fsm;
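
The way LatchState_A separates each 51-bit cid input into three 24-bit words for storage can be isolated as a small combinational block. The following is a minimal sketch and not part of the thesis source: the entity name cid_split and its port names are hypothetical, and only the bit slicing is taken from the ESSD listing above.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical helper (assumption, not in the original design):
-- splits one 51-bit cid vector into three 24-bit memory words,
-- using the same slices as LatchState_A in the ESSD.
entity cid_split is
  port (
    cid_in : in  std_logic_vector(50 downto 0);
    word_a : out std_logic_vector(23 downto 0);  -- bits 50..48, zero padded
    word_b : out std_logic_vector(23 downto 0);  -- bits 47..24
    word_c : out std_logic_vector(23 downto 0)   -- bits 23..0
  );
end cid_split;

architecture rtl of cid_split is
begin
  word_c <= cid_in(23 downto 0);
  word_b <= cid_in(47 downto 24);
  -- Only three bits remain for the top word; the rest is padded with zeros.
  word_a(2 downto 0)  <= cid_in(50 downto 48);
  word_a(23 downto 3) <= (others => '0');
end rtl;
```

Because 51 = 24 + 24 + 3, the third word carries only three syndrome bits, which is why the ESSD needs three store states per cid.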
E. KDLX

The KDLX is a 16-bit RISC soft-core processor with a five-stage pipeline: fetch,
decode, execute, memory, and write back. The KDLX was coded by Dr. Kenneth
Clark. The hierarchy of its source files in the ISE software is as follows.


dlx_testbench (dlx_out.vhd)
  dlx (dlx.vhd)
    core (core.vhd)
      alu (alu.vhd)
        adder (adder.vhd)
          ao22 (AO22.vhd)
        alu_logic (alu_logic.vhd)
        log_barrel (log_barrel.vhd)
          word_mux4 (word_mux4.vhd)
        word_mux4 (word_mux4.vhd)
        word_set (word_set.vhd)
          zero_test (zero_test.vhd)
      pc_control (pc_control.vhd)
        increment (increment.vhd)
        word_mux3 (word_mux3.vhd)
        word_reg_single (word_reg_single.vhd)
          scan_reg (scan_reg.vhd)
      pipeline (pipeline.vhd)
        twelve_bit_reg_single (twelve_bit_reg_single.vhd)
          scan_reg (scan_reg.vhd)
        twenty_four_bit_reg_single (twenty_four_bit_reg_single.vhd)
          twelve_bit_reg_single (twelve_bit_reg_single.vhd)
            scan_reg (scan_reg.vhd)
      regfile (regfile.vhd)
        dest_decoder (Dest_Decoder.vhd)
        word_mux16 (word_mux16.vhd)
        word_reg_single (word_reg_single.vhd)
          scan_reg (scan_reg.vhd)
      rw_control (rw_control.vhd)
      word_mux3 (word_mux3.vhd)
      word_mux4 (word_mux4.vhd)
      word_reg_single (word_reg_single.vhd)
        scan_reg (scan_reg.vhd)
      zero_test (zero_test.vhd)
    io_pads (IO_Pads.vhd)
1. adder.vhd

LIBRARY IEEE;
USE IEEE.std_logic_1164.all;
USE IEEE.std_logic_arith.all;
USE IEEE.std_logic_unsigned.all;

-- ***** adder model *****

-- external ports
ENTITY adder IS PORT (
  A : IN std_logic_vector(15 downto 0);
  B : IN std_logic_vector(15 downto 0);
  alu_op1 : IN std_logic;
  alu_op3 : IN std_logic;
  alu_op4 : IN std_logic;
  Out_word : OUT std_logic_vector(15 downto 0)
);
END adder;

-- internal structure
ARCHITECTURE rtl OF adder IS

-- COMPONENTS
COMPONENT AO22
PORT (
  A : IN std_logic;
  B : IN std_logic;
  C : IN std_logic;
  D : IN std_logic;
  \Out\ : OUT std_logic
);
END COMPONENT;

SIGNAL Vdd : std_logic;
SIGNAL subtract : std_logic;

-- INSTANCES
BEGIN

Vdd <= '1';

AO22_1 : AO22 PORT MAP(
  A => Vdd,
  B => alu_op1,
  C => alu_op4,
  D => alu_op3,
  \Out\ => subtract
);

-- subtract when the AND-OR gate output is high, otherwise add
process (A, B, subtract)
begin
  if (subtract = '1') then
    out_word <= A-B;
  else
    out_word <= A+B;
  end if;
end process;

END rtl;
2. alu.vhd

LIBRARY IEEE;
USE IEEE.std_logic_1164.all;

-- ***** alu model *****

-- external ports
ENTITY alu IS PORT (
  A : IN std_logic_vector (15 downto 0);
  alu_op : IN std_logic_vector (4 downto 0);
  alu_out : OUT std_logic_vector (15 downto 0);
  B : IN std_logic_vector (15 downto 0)
);
END alu;

-- internal structure
ARCHITECTURE structural OF alu IS

-- COMPONENTS
COMPONENT adder
PORT (
  A : IN std_logic_vector (15 downto 0);
  B : IN std_logic_vector (15 downto 0);
  alu_op1 : IN std_logic;
  alu_op3 : IN std_logic;
  alu_op4 : IN std_logic;
  Out_word : OUT std_logic_vector (15 downto 0)
);
END COMPONENT;

COMPONENT alu_logic
PORT (
  A : IN std_logic_vector (15 downto 0);
  B : IN std_logic_vector (15 downto 0);
  Func : IN std_logic_vector (1 downto 0);
  logic_out : OUT std_logic_vector (15 downto 0)
);
END COMPONENT;

COMPONENT log_barrel
PORT (
  ar_or_log : IN std_logic;
  In_Word : IN std_logic_vector (15 downto 0);
  l_or_r : IN std_logic;
  Out_word : OUT std_logic_vector (15 downto 0);
  Shift : IN std_logic_vector (3 downto 0)
);
END COMPONENT;

COMPONENT word_mux4
PORT (
  A : IN std_logic_vector (15 downto 0);
  B : IN std_logic_vector (15 downto 0);
  C : IN std_logic_vector (15 downto 0);
  D : IN std_logic_vector (15 downto 0);
  Sel : IN std_logic_vector (1 downto 0);
  Out_word : OUT std_logic_vector (15 downto 0)
);
END COMPONENT;

COMPONENT word_set
PORT (
  In_word : IN std_logic_vector (15 downto 0);
  set_op : IN std_logic_vector (2 downto 0);
  set_out : OUT std_logic
);
END COMPONENT;

-- SIGNALS
SIGNAL set_out : std_logic_vector (15 downto 0);
SIGNAL log_barrel_out : std_logic_vector (15 downto 0);
SIGNAL logic_out : std_logic_vector (15 downto 0);
SIGNAL Adder_Out : std_logic_vector (15 downto 0);

-- INSTANCES
BEGIN

set_out(15 downto 1) <= "000000000000000";

halfword_adder_1 : adder PORT MAP(
  A => A,
  alu_op1 => alu_op(1),
  alu_op3 => alu_op(3),
  alu_op4 => alu_op(4),
  B => B,
  Out_word => Adder_Out
);

halfword_alu_logic_1 : alu_logic PORT MAP(
  A => A,
  B => B,
  Func => alu_op(1 downto 0),
  logic_out => logic_out
);

halfword_log_barrel_1 : log_barrel PORT MAP(
  ar_or_log => alu_op(0),
  In_word => A,
  l_or_r => alu_op(1),
  Out_word => log_barrel_out,
  Shift => B(3 downto 0)
);

halfword_mux4_1 : word_mux4 PORT MAP(
  A => Adder_Out,
  B => logic_out,
  C => log_barrel_out,
  D => set_out,
  Out_word => alu_out,
  Sel => alu_op(4 downto 3)
);

halfword_set_1 : word_set PORT MAP(
  In_word => Adder_Out,
  set_op => alu_op(2 downto 0),
  set_out => set_out(0)
);

END structural;
3. alu_logic.vhd

LIBRARY IEEE;
USE IEEE.std_logic_1164.all;

-- ***** alu_logic model *****

-- external ports
ENTITY alu_logic IS PORT (
  A : IN std_logic_vector(15 downto 0);
  B : IN std_logic_vector(15 downto 0);
  Func : IN std_logic_vector(1 downto 0);
  logic_out : OUT std_logic_vector(15 downto 0)
);
END alu_logic;

-- internal structure
ARCHITECTURE rtl OF alu_logic IS
BEGIN

process (A, B, func)
begin
  case func is
    when "00" => logic_out <= A;
    when "01" => logic_out <= (A and B);
    when "10" => logic_out <= (A or B);
    when others => logic_out <= (A xor B);
  end case;
end process;

END rtl;


4. AO22.vhd

LIBRARY IEEE;
USE IEEE.std_logic_1164.all;

entity AO22 is port (
  A, B, C, D : IN std_logic;
  \Out\ : OUT std_logic);
end AO22;

architecture behavioral of AO22 is
begin
  \Out\ <= (A and B) or (C and D);
end behavioral;
5. core.vhd

LIBRARY IEEE;
USE IEEE.std_logic_1164.all;
USE IEEE.std_logic_arith.all;

-- ***** core model *****

-- external ports
ENTITY core IS PORT (
  Addr_Int : OUT std_logic_vector(15 downto 0);
  Clock_in : IN std_logic;
  Input_Data : IN std_logic_vector(15 downto 0);
  Output_Data : OUT std_logic_vector(15 downto 0);
  Instr : IN std_logic_vector(23 downto 0);
  PC : OUT std_logic_vector(15 downto 0);
  Prog_Rd : OUT std_logic;
  Rd : OUT std_logic;
  Resetn : IN std_logic;
  Stalln : IN std_logic;
  Wr : OUT std_logic
);
END core;

-- internal structure
ARCHITECTURE structural OF core IS

-- COMPONENTS
COMPONENT alu
PORT (
  A : IN std_logic_vector(15 downto 0);
  alu_op : IN std_logic_vector(4 downto 0);
  alu_out : OUT std_logic_vector(15 downto 0);
  B : IN std_logic_vector(15 downto 0)
);
END COMPONENT;

COMPONENT word_mux3
PORT (
  A : IN std_logic_vector(15 downto 0);
  B : IN std_logic_vector(15 downto 0);
  C : IN std_logic_vector(15 downto 0);
  Out_word : OUT std_logic_vector(15 downto 0);
  Sel : IN std_logic_vector(1 downto 0)
);
END COMPONENT;

COMPONENT word_mux4
PORT (
  A : IN std_logic_vector(15 downto 0);
  B : IN std_logic_vector(15 downto 0);
  C : IN std_logic_vector(15 downto 0);
  D : IN std_logic_vector(15 downto 0);
  Out_word : OUT std_logic_vector(15 downto 0);
  Sel : IN std_logic_vector(1 downto 0)
);
END COMPONENT;

COMPONENT regfile
PORT (
  A : OUT std_logic_vector(15 downto 0);
  B : OUT std_logic_vector(15 downto 0);
  clock : IN std_logic;
  Data_In : IN std_logic_vector(15 downto 0);
  Dest : IN std_logic_vector(3 downto 0);
  stalln : IN std_logic;
  resetn : IN std_logic;
  RSone : IN std_logic_vector(3 downto 0);
  RStwo : IN std_logic_vector(3 downto 0);
  scan_data_in : IN std_logic;
  scan_enable : IN std_logic;
  wb_enable : IN std_logic
);
END COMPONENT;

COMPONENT word_reg_single
PORT (
  Clock : IN std_logic;
  Data_In : IN std_logic_vector(15 downto 0);
  Data_out : OUT std_logic_vector(15 downto 0);
  Enable : IN std_logic;
  Resetn : IN std_logic;
  Scan_Data_In : IN std_logic;
  Scan_Enable : IN std_logic
);
END COMPONENT;

COMPONENT pc_control
PORT (
  ALU_Out : IN std_logic_vector(15 downto 0);
  Clock : IN std_logic;
  D2_Inc_PC : OUT std_logic_vector(15 downto 0);
  D_Link_PC : OUT std_logic_vector(15 downto 0);
  IAR_Enable : IN std_logic;
  PC : OUT std_logic_vector(15 downto 0);
  PC_Sel : IN std_logic_vector(1 downto 0);
  Resetn : IN std_logic;
  Scan_Data_In : IN std_logic;
  Scan_Data_Out : OUT std_logic;
  Scan_Enable : IN std_logic;
  Stalln : IN std_logic
);
END COMPONENT;

COMPONENT pipeline
PORT (
  alu_op : OUT std_logic_vector(4 downto 0);
  A_Mux : OUT std_logic_vector(1 downto 0);
  B_Mux : OUT std_logic_vector(1 downto 0);
  Clock : IN std_logic;
  Data_In : IN std_logic_vector(23 downto 0);
  Dest : OUT std_logic_vector(3 downto 0);
  Immed : OUT std_logic_vector(15 downto 0);
  PC_Sel : OUT std_logic_vector(1 downto 0);
  rd_enable : OUT std_logic;
  Reg_In_Sel : OUT std_logic_vector(1 downto 0);
  Resetn : IN std_logic;
  RSone : OUT std_logic_vector(3 downto 0);
  RStwo : OUT std_logic_vector(3 downto 0);
  Scan_Data_In : IN std_logic;
  Scan_Enable : IN std_logic;
  Stalln : IN std_logic;
  wb_enable : OUT std_logic;
  scan_out : OUT std_logic;
  IAR_Enable : OUT std_logic;
  wr_enable : OUT std_logic;
  zero_flag : IN std_logic
);
END COMPONENT;

COMPONENT rw_control
PORT (
  Clock : IN std_logic;
  Prog_Rd : OUT std_logic;
  Rd : OUT std_logic;
  rd_enable : IN std_logic;
  resetn : IN std_logic;
  stalln : IN std_logic;
  Wr : OUT std_logic;
  wr_enable : IN std_logic
);
END COMPONENT;

COMPONENT zero_test
PORT (
  In_word : IN std_logic_vector(15 downto 0);
  zero_flag : OUT std_logic
);
END COMPONENT;

-- SIGNALS
SIGNAL wr_enable : std_logic;
SIGNAL zero_flag : std_logic;
SIGNAL IAR_Enable : std_logic;
SIGNAL wb_enable : std_logic;
SIGNAL pipeline_scan_out : std_logic;
SIGNAL Dest : std_logic_vector(3 downto 0);
SIGNAL A : std_logic_vector(15 downto 0);
SIGNAL D2_Inc_PC : std_logic_vector(15 downto 0);
SIGNAL Immed : std_logic_vector(15 downto 0);
SIGNAL D_ALU_Out : std_logic_vector(15 downto 0);
SIGNAL D_Link_PC : std_logic_vector(15 downto 0);
SIGNAL Reg_In_Sel : std_logic_vector(1 downto 0);
SIGNAL ALU_A : std_logic_vector(15 downto 0);
SIGNAL ALU_Out : std_logic_vector(15 downto 0);
SIGNAL ALU_B : std_logic_vector(15 downto 0);
SIGNAL Gnd : std_logic;
SIGNAL B : std_logic_vector(15 downto 0);
SIGNAL LD_Memory_In : std_logic_vector(15 downto 0);
SIGNAL output_en_n : std_logic;
SIGNAL rd_enable : std_logic;
SIGNAL pc_control_scan_out : std_logic;
SIGNAL Buf_Stalln : std_logic;
SIGNAL Buf_resetn : std_logic;
SIGNAL Clock : std_logic;
SIGNAL Buf_Addr_Int : std_logic_vector(15 downto 0);
SIGNAL Shift_En : std_logic;
SIGNAL alu_op : std_logic_vector(4 downto 0);
SIGNAL Buf_Scan_Data_Out : std_logic;
SIGNAL A_Mux : std_logic_vector(1 downto 0);
SIGNAL B_Mux : std_logic_vector(1 downto 0);
SIGNAL RSone : std_logic_vector(3 downto 0);
SIGNAL RStwo : std_logic_vector(3 downto 0);
SIGNAL PC_Sel : std_logic_vector(1 downto 0);
SIGNAL Data_Out : std_logic_vector(15 downto 0);
SIGNAL Regfile_In : std_logic_vector(15 downto 0);
SIGNAL zero_byte : std_logic_vector(7 downto 0);
SIGNAL Data_In : std_logic_vector(15 downto 0);
SIGNAL sign_ext_immed : std_logic_vector(15 downto 0);
SIGNAL scan_data_in : std_logic;

-- INSTANCES
BEGIN

clock <= clock_in;
shift_en <= '0';
scan_data_in <= '0';
Addr_Int <= Buf_Addr_Int;
zero_byte <= "00000000";
sign_ext_immed(15 downto 8) <= Immed(7) & Immed(7) & Immed(7) &
  Immed(7) & Immed(7) & Immed(7) & Immed(7) & Immed(7);
sign_ext_immed(7 downto 0) <= Immed(7 downto 0);
Wr <= output_en_n;
Output_Data <= Data_Out;

Word_Reg_1 : word_reg_single PORT MAP(
  Clock => Clock,
  Data_In => B,
  Data_out => Data_Out,
  Enable => Stalln,
  Resetn => Resetn,
  Scan_Data_In => pc_control_scan_out,
  Scan_Enable => Shift_En
);

Word_Reg_2 : word_reg_single PORT MAP(
  Clock => Clock,
  Data_In => Input_Data,
  Data_out => LD_Memory_In,
  Enable => Stalln,
  Resetn => Resetn,
  Scan_Data_In => Data_Out(15),
  Scan_Enable => Shift_En
);

alu_1 : alu PORT MAP(
  A => ALU_A,
  alu_op => alu_op,
  alu_out => ALU_Out,
  B => ALU_B
);

word_mux3_1 : word_mux3 PORT MAP(
  A => D_ALU_Out,
  B => LD_Memory_In,
  C => D_Link_PC,
  Out_word => Regfile_In,
  Sel => Reg_In_Sel
);

word_mux3_2 : word_mux3 PORT MAP(
  A => B,
  B(7 downto 0) => Immed(7 downto 0),
  B(15 downto 8) => zero_byte,
  C => sign_ext_immed,
  Out_word => ALU_B,
  Sel => B_Mux
);

word_mux4_1 : word_mux4 PORT MAP(
  A => A,
  B => D2_Inc_PC,
  C(7 downto 0) => zero_byte,
  C(15 downto 8) => Immed(7 downto 0),
  D => Immed(15 downto 0),
  Out_word => ALU_A,
  Sel => A_Mux
);

regfile_1 : regfile PORT MAP(
  A => A,
  B => B,
  clock => Clock,
  Data_In => regfile_in,
  Dest => Dest,
  stalln => stalln,
  resetn => resetn,
  RSone => RSone,
  RStwo => RStwo,
  scan_data_in => pipeline_scan_out,
  scan_enable => Shift_En,
  wb_enable => wb_enable
);

word_reg_single_3 : word_reg_single PORT MAP(
  Clock => Clock,
  Data_In => Buf_Addr_Int,
  Data_out => D_ALU_Out,
  Enable => Stalln,
  Resetn => resetn,
  Scan_Data_In => Buf_Addr_Int(15),
  Scan_Enable => Shift_En
);

word_reg_single_4 : word_reg_single PORT MAP(
  Clock => Clock,
  Data_In => ALU_Out,
  Data_out => Buf_Addr_Int,
  Enable => Stalln,
  Resetn => resetn,
  Scan_Data_In => B(15),
  Scan_Enable => Shift_En
);

pc_control_1 : pc_control PORT MAP(
  ALU_Out => ALU_Out,
  Clock => Clock,
  D2_Inc_PC => D2_Inc_PC,
  D_Link_PC => D_Link_PC,
  IAR_Enable => IAR_Enable,
  PC => PC,
  PC_Sel => PC_Sel,
  Resetn => resetn,
  Scan_Data_In => D_ALU_Out(15),
  Scan_Data_Out => pc_control_scan_out,
  Scan_Enable => Shift_En,
  Stalln => Stalln
);

pipeline_1 : pipeline PORT MAP(
  alu_op => alu_op,
  A_Mux => A_Mux,
  B_Mux => B_Mux,
  Clock => Clock,
  Data_In => Instr,
  Dest => Dest,
  Immed => Immed,
  PC_Sel => PC_Sel,
  rd_enable => rd_enable,
  Reg_In_Sel => Reg_In_Sel,
  Resetn => resetn,
  RSone => RSone,
  RStwo => RStwo,
  Scan_Data_In => Scan_Data_In,
  Scan_Enable => Shift_En,
  Stalln => Stalln,
  wb_enable => wb_enable,
  scan_out => pipeline_scan_out,
  IAR_Enable => IAR_Enable,
  wr_enable => wr_enable,
  zero_flag => zero_flag
);

rw_control_1 : rw_control PORT MAP(
  Clock => Clock,
  Prog_Rd => Prog_Rd,
  Rd => Rd,
  rd_enable => rd_enable,
  resetn => resetn,
  stalln => Stalln,
  Wr => output_en_n,
  wr_enable => wr_enable
);

zero_test_1 : zero_test PORT MAP(
  In_word => A,
  zero_flag => zero_flag
);

END structural;
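
The core builds sign_ext_immed by replicating bit 7 of the immediate into the upper byte. For comparison, the same 8-bit to 16-bit sign extension can be written with the standard numeric_std package. This is a minimal sketch under stated assumptions, not the author's code: the entity name sign_ext8 and its port names are hypothetical, and the listing above uses std_logic_arith rather than numeric_std.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical stand-alone block (assumption, not in the KDLX source):
-- sign-extends an 8-bit immediate to 16 bits.
entity sign_ext8 is
  port (
    immed8 : in  std_logic_vector(7 downto 0);
    ext16  : out std_logic_vector(15 downto 0)
  );
end sign_ext8;

architecture rtl of sign_ext8 is
begin
  -- resize on a signed value replicates the sign bit (bit 7) into
  -- bits 15 downto 8, matching the explicit concatenation in core.vhd.
  ext16 <= std_logic_vector(resize(signed(immed8), 16));
end rtl;
```

For example, an immediate of X"FE" extends to X"FFFE", while X"05" extends to X"0005".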
6. Dest_Decoder.vhd

LIBRARY IEEE;
USE IEEE.std_logic_1164.all;

-- ***** Dest_Decoder model *****

-- external ports
ENTITY Dest_Decoder IS PORT (
  Dest : IN std_logic_vector(3 downto 0);
  Enable : OUT std_logic_vector(15 downto 1);
  wb_enable : IN std_logic
);
END Dest_Decoder;

-- internal structure
ARCHITECTURE rtl OF Dest_Decoder IS

-- SIGNALS
SIGNAL buf_enable : std_logic_vector(15 downto 1);

-- INSTANCES
BEGIN

with dest select
  buf_enable <= "000000000000001" when "0001",
                "000000000000010" when "0010",
                "000000000000100" when "0011",
                "000000000001000" when "0100",
                "000000000010000" when "0101",
                "000000000100000" when "0110",
                "000000001000000" when "0111",
                "000000010000000" when "1000",
                "000000100000000" when "1001",
                "000001000000000" when "1010",
                "000010000000000" when "1011",
                "000100000000000" when "1100",
                "001000000000000" when "1101",
                "010000000000000" when "1110",
                "100000000000000" when others;

Enable <= buf_enable when (wb_enable = '1') else
          "000000000000000";

END rtl;
7. dlx.vhd

LIBRARY IEEE;
USE IEEE.std_logic_1164.all;
USE IEEE.std_logic_arith.all;

-- ***** dlx model *****

-- external ports
ENTITY dlx IS PORT (
  Addr_Int : OUT std_logic_vector(15 downto 0);
  Clock_in : IN std_logic;
  Data : INOUT std_logic_vector(15 downto 0);
  Instr : IN std_logic_vector(23 downto 0);
  PC : OUT std_logic_vector(15 downto 0);
  Prog_Rd : OUT std_logic;
  Rd : OUT std_logic;
  Resetn : IN std_logic;
  Stalln : IN std_logic;
  Wr : OUT std_logic
);
END dlx;

-- internal structure
ARCHITECTURE structural OF dlx IS

-- COMPONENTS
COMPONENT core
PORT (
  Addr_Int : OUT std_logic_vector(15 downto 0);
  Clock_in : IN std_logic;
  Input_Data : IN std_logic_vector(15 downto 0);
  Output_Data : OUT std_logic_vector(15 downto 0);
  Instr : IN std_logic_vector(23 downto 0);
  PC : OUT std_logic_vector(15 downto 0);
  Prog_Rd : OUT std_logic;
  Rd : OUT std_logic;
  Resetn : IN std_logic;
  Stalln : IN std_logic;
  Wr : OUT std_logic
);
END COMPONENT;

COMPONENT IO_Pads
PORT (
  Pads : INOUT std_logic_vector (15 downto 0);
  In_Data : OUT std_logic_vector (15 downto 0);
  Out_Data : IN std_logic_vector (15 downto 0);
  Output_En_n : IN std_logic
);
END COMPONENT;

-- SIGNALS
signal Input_data : std_logic_vector(15 downto 0);
signal Output_data : std_logic_vector(15 downto 0);
signal wr_int : std_logic;

-- INSTANCES
BEGIN

wr <= wr_int;

Core1 : core PORT MAP(
  Addr_Int => Addr_Int,
  Clock_in => Clock_In,
  Input_Data => Input_data,
  Output_Data => Output_data,
  Instr => Instr,
  PC => PC,
  Prog_Rd => Prog_Rd,
  Rd => Rd,
  Resetn => Resetn,
  Stalln => stalln,
  Wr => Wr_int
);

IO_Pads_1 : IO_Pads PORT MAP(
  Pads => Data,
  In_Data => Input_Data,
  Out_Data => Output_Data,
  Output_En_n => wr_int
);

END structural;


8. dlx_out.vhd


-- Test bench shell
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity dlx_testbench is end dlx_testbench;
architecture testbench of dlx_testbench is
-- Declaration of the component under test
component DLX
   port (
      Addr_Int : OUT   std_logic_vector(15 downto 0);
      Clock_in : IN    std_logic;
      Data     : INOUT std_logic_vector(15 downto 0);
      Instr    : IN    std_logic_vector(23 downto 0);
      PC       : OUT   std_logic_vector(15 downto 0);
      Prog_Rd  : OUT   std_logic;
      Rd       : OUT   std_logic;
      Resetn   : IN    std_logic;
      Stalln   : IN    std_logic;
      Wr       : OUT   std_logic
   );
end component;

signal addr_int : std_logic_vector(15 downto 0);
signal instr    : std_logic_vector(23 downto 0);
signal pc       : std_logic_vector(15 downto 0);
signal data     : std_logic_vector(15 downto 0);
signal resetn   : std_logic;
signal prog_rd  : std_logic;
signal rd       : std_logic;
signal wr       : std_logic;
signal stalln   : std_logic;
signal clock_in : std_logic;

begin

process -- 10 MHz clock
begin
   clock_in <= '0';
   wait for 25 ns;
   clock_in <= '0';
   wait for 25 ns;
   clock_in <= '1';
   wait for 25 ns;
   clock_in <= '0';
   wait for 25 ns;
end process;


process
begin -- power up reset process
   wait for 1 ns;
   resetn <= '0';
   stalln <= '1';
   wait for 10 ns;
   resetn <= '1';
   wait;
end process;
process
begin
   wait for 1 ns;
   instr <= X"000000";   -- NOP
   data  <= "ZZZZZZZZZZZZZZZZ";
   wait for 100 ns;
   instr <= X"080101";   -- LHI R1, #1
   wait for 100 ns;
   instr <= X"080202";   -- LHI R2, #2
   wait for 100 ns;
   instr <= X"080303";   -- LHI R3, #3
   wait for 100 ns;
   instr <= X"080404";   -- LHI R4, #4
   wait for 100 ns;
   instr <= X"080505";   -- LHI R5, #5
   wait for 100 ns;
   instr <= X"080606";   -- LHI R6, #6
   wait for 100 ns;
   instr <= X"080707";   -- LHI R7, #7
   wait for 100 ns;
   instr <= X"080808";   -- LHI R8, #8
   wait for 100 ns;
   instr <= X"080909";   -- LHI R9, #9
   wait for 100 ns;
   instr <= X"080A0A";   -- LHI R10, #10
   wait for 100 ns;
   instr <= X"080B0B";   -- LHI R11, #11
   wait for 100 ns;
   instr <= X"080C0C";   -- LHI R12, #12
   wait for 100 ns;
   instr <= X"080D0D";   -- LHI R13, #13
   wait for 100 ns;
   instr <= X"080E0E";   -- LHI R14, #14
   wait for 100 ns;
   instr <= X"080F0F";   -- LHI R15, #15
   wait for 100 ns;
   instr <= X"4111FE";   -- ADDI R1, R1, FE
   wait for 100 ns;
   instr <= X"2122FD";   -- ADDUI R2, R2, FD
   wait for 100 ns;
   instr <= X"013340";   -- ADD R3, R3, R4
   wait for 100 ns;
   instr <= X"4344FF";   -- SUBI R4, R4, FF
   wait for 100 ns;
   instr <= X"235501";   -- SUBUI R5, R5, #1
   wait for 100 ns;
   instr <= X"036670";   -- SUB R6, R6, R7
   wait for 100 ns;
   instr <= X"2977FF";   -- ANDI R7, R7, FF
   wait for 100 ns;
   instr <= X"098880";   -- AND R8, R8, R9
   wait for 100 ns;
   instr <= X"2A99FF";   -- ORI R9, R9, FF
   wait for 100 ns;
   instr <= X"0AAAB0";   -- OR R10, R10, R11
   wait for 100 ns;
   instr <= X"2BBBF0";   -- XORI R11, R11, F0
   wait for 100 ns;
   instr <= X"0BCCD0";   -- XOR R12, R12, R13
   wait for 100 ns;
   instr <= X"450100";   -- SW R0, R1
   wait for 100 ns;
   instr <= X"451200";   -- SW R1, R2
   wait for 100 ns;
   instr <= X"452300";   -- SW R2, R3
   wait for 100 ns;
   instr <= X"453400";   -- SW R3, R4
   wait for 100 ns;
   instr <= X"454500";   -- SW R4, R5
   wait for 100 ns;
   instr <= X"455600";   -- SW R5, R6
   wait for 100 ns;
   instr <= X"456700";   -- SW R6, R7
   wait for 100 ns;
   instr <= X"457800";   -- SW R7, R8
   wait for 100 ns;
   instr <= X"458900";   -- SW R8, R9
   wait for 100 ns;
   instr <= X"459A00";   -- SW R9, R10
   wait for 100 ns;
   instr <= X"45AB00";   -- SW R10, R11
   wait for 100 ns;
   instr <= X"45BC00";   -- SW R11, R12
   wait for 100 ns;
   instr <= X"45CD00";   -- SW R12, R13
   wait for 100 ns;
   instr <= X"311104";   -- SLLI R1, R1, #4
   wait for 100 ns;
   instr <= X"112240";   -- SLL R2, R2, R4
   wait for 100 ns;
   instr <= X"326304";   -- SRLI R3, R6, #4
   wait for 100 ns;
   instr <= X"126440";   -- SRL R4, R6, R4
   wait for 100 ns;
   instr <= X"336504";   -- SRAI R5, R6, #4
   wait for 100 ns;
   instr <= X"136640";   -- SRA R6, R6, R4
   wait for 100 ns;
   instr <= X"387701";   -- SEQI R7, R7, #1
   wait for 100 ns;
   instr <= X"387800";   -- SEQI R8, R7, #0
   wait for 100 ns;
   instr <= X"3D7900";   -- SNEI R9, R7, #0
   wait for 100 ns;
   instr <= X"3D7A01";   -- SNEI R10, R7, #1
   wait for 100 ns;
   instr <= X"1D1B10";   -- SNE R11, R1, R1
   wait for 100 ns;
   instr <= X"1D1C20";   -- SNE R12, R1, R2
   wait for 100 ns;
   instr <= X"3C7D00";   -- SLTI R13, R7, #0
   wait for 100 ns;
   instr <= X"3C7E01";   -- SLTI R13, R7, #0
   wait for 100 ns;
   instr <= X"450100";   -- SW R0, R1
   wait for 100 ns;
   instr <= X"451200";   -- SW R1, R2
   wait for 100 ns;
   instr <= X"452300";   -- SW R2, R3
   wait for 100 ns;
   instr <= X"453400";   -- SW R3, R4
   wait for 100 ns;
   instr <= X"454500";   -- SW R4, R5
   wait for 100 ns;
   instr <= X"455600";   -- SW R5, R6
   wait for 100 ns;
   instr <= X"456700";   -- SW R6, R7
   wait for 100 ns;
   instr <= X"457800";   -- SW R7, R8
   wait for 100 ns;
   instr <= X"458900";   -- SW R8, R9
   wait for 100 ns;
   instr <= X"459A00";   -- SW R9, R10
   wait for 100 ns;
   instr <= X"45AB00";   -- SW R10, R11
   wait for 100 ns;
   instr <= X"45BC00";   -- SW R11, R12
   wait for 100 ns;
   instr <= X"45CD00";   -- SW R12, R13
   wait for 100 ns;
   instr <= X"45DE00";   -- SW R13, R14
   wait for 100 ns;
   instr <= X"187180";   -- SEQ R1, R7, R8
   wait for 100 ns;
   instr <= X"187290";   -- SEQ R2, R7, R9
   wait for 100 ns;
   instr <= X"1C7360";   -- SLT R3, R7, R6
   wait for 100 ns;
   instr <= X"1C6470";   -- SLT R4, R6, R7
   wait for 100 ns;
   instr <= X"1A6570";   -- SGT R5, R6, R7
   wait for 100 ns;
   instr <= X"1A7660";   -- SGT R6, R7, R6
   wait for 100 ns;
   instr <= X"5A8701";   -- SGTI R8, R7, #1
   wait for 100 ns;
   instr <= X"5A8800";   -- SGTI R8, R8, 0
   wait for 100 ns;
   instr <= X"5BB9FF";   -- SLEI R9, R11, FF
   wait for 100 ns;
   instr <= X"5BBA01";   -- SLEI R10, R11, #1
   wait for 100 ns;
   instr <= X"5BBB02";   -- SLEI R11, R11, #2
   wait for 100 ns;
   instr <= X"1B2C10";   -- SLE R12, R2, R1
   wait for 100 ns;
   instr <= X"1B2D40";   -- SLE R13, R2, R4
   wait for 100 ns;
   instr <= X"1B1E20";   -- SLE R14, R1, R2
   wait for 100 ns;
   instr <= X"450100";   -- SW R0, R1





   wait for 100 ns;
   instr <= X"451200";   -- SW R1, R2
   wait for 100 ns;
   instr <= X"452300";   -- SW R2, R3
   wait for 100 ns;
   instr <= X"453400";   -- SW R3, R4
   wait for 100 ns;
   instr <= X"454500";   -- SW R4, R5
   wait for 100 ns;
   instr <= X"455600";   -- SW R5, R6
   wait for 100 ns;
   instr <= X"456700";   -- SW R6, R7
   wait for 100 ns;
   instr <= X"457800";   -- SW R7, R8
   wait for 100 ns;
   instr <= X"458900";   -- SW R8, R9
   wait for 100 ns;
   instr <= X"459A00";   -- SW R9, R10
   wait for 100 ns;
   instr <= X"45AB00";   -- SW R10, R11
   wait for 100 ns;
   instr <= X"45BC00";   -- SW R11, R12
   wait for 100 ns;
   instr <= X"45CD00";   -- SW R12, R13
   wait for 100 ns;
   instr <= X"45DE00";   -- SW R13, R14
   wait for 100 ns;
   instr <= X"191120";   -- SGE R1, R1, R2
   wait for 100 ns;
   instr <= X"192210";   -- SGE R2, R2, R1
   wait for 100 ns;
   instr <= X"192320";   -- SGE R3, R2, R2
   wait for 100 ns;
   instr <= X"595402";   -- SGEI R4, R5, #02
   wait for 100 ns;
   instr <= X"5955FF";   -- SGEI R5, R5, FF
   wait for 100 ns;
   instr <= X"596500";   -- SGEI R6, R5, #0
   wait for 100 ns;
   instr <= X"450100";   -- SW R0, R1
   wait for 100 ns;
   instr <= X"451200";   -- SW R1, R2
   wait for 100 ns;
   instr <= X"452300";   -- SW R2, R3
   wait for 100 ns;
   instr <= X"453400";   -- SW R3, R4
   wait for 100 ns;
   instr <= X"454500";   -- SW R4, R5
   wait for 100 ns;
   instr <= X"455600";   -- SW R5, R6
   wait for 100 ns;
   instr <= X"C800FF";   -- J 0x00FF
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   instr <= X"E88000";   -- JAL 0x8000
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   instr <= X"450F00";   -- SW R0, R15
   wait for 100 ns;
   instr <= X"C1200F";   -- BEQZ R2, 0x0F
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   instr <= X"C1000F";   -- BEQZ R0, 0x0F
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   instr <= X"C0000F";   -- BNEZ R0, 0x0F
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   instr <= X"C0200F";   -- BNEZ R2, 0x0F
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   instr <= X"48F000";   -- JR R15
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   instr <= X"68F000";   -- JALR R15
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   instr <= X"450F00";   -- SW R0, R15
   wait for 100 ns;
   instr <= X"28FF00";   -- TRAP FF00
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   instr <= X"F80000";   -- RFE
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   DATA  <= X"FFF1";
   instr <= X"440100";   -- LW R0(0), R1
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;

   DATA  <= "ZZZZZZZZZZZZZZZZ";
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   instr <= X"450100";   -- SW R0(0), R1
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
   instr <= X"000000";   -- NOP
   wait for 100 ns;
end process;


-- Place stimulus and analysis statements here

dut : DLX port map (
   Instr    => Instr,
   Addr_int => addr_int,
   PC       => PC,
   Data     => data,
   Resetn   => resetn,
   Prog_Rd  => prog_rd,
   Rd       => rd,
   Wr       => wr,
   Stalln   => stalln,
   Clock_in => clock_in
);

end testbench;





9. increment.vhd


LIBRARY IEEE;
USE IEEE.std_logic_1164.all;
USE IEEE.std_logic_arith.all;

-- ***** model *****
-- external ports
ENTITY dlx IS PORT (
   Addr_Int : OUT   std_logic_vector(15 downto 0);
   Clock_in : IN    std_logic;
   Data     : INOUT std_logic_vector(15 downto 0);
   Instr    : IN    std_logic_vector(23 downto 0);
   PC       : OUT   std_logic_vector(15 downto 0);
   Prog_Rd  : OUT   std_logic;
   Rd       : OUT   std_logic;
   Resetn   : IN    std_logic;
   Stalln   : IN    std_logic;
   Wr       : OUT   std_logic
);
END dlx;

-- internal structure
ARCHITECTURE structural OF dlx IS

-- COMPONENTS
COMPONENT core
   PORT (
      Addr_Int    : OUT std_logic_vector(15 downto 0);
      Clock_in    : IN  std_logic;
      Input_Data  : IN  std_logic_vector(15 downto 0);
      Output_Data : OUT std_logic_vector(15 downto 0);
      Instr       : IN  std_logic_vector(23 downto 0);
      PC          : OUT std_logic_vector(15 downto 0);
      Prog_Rd     : OUT std_logic;
      Rd          : OUT std_logic;
      Resetn      : IN  std_logic;
      Stalln      : IN  std_logic;
      Wr          : OUT std_logic
   );
END COMPONENT;

COMPONENT IO_Pads
   PORT (
      Pads        : INOUT std_logic_vector(15 downto 0);
      In_Data     : OUT   std_logic_vector(15 downto 0);
      Out_Data    : IN    std_logic_vector(15 downto 0);
      Output_En_n : IN    std_logic
   );
END COMPONENT;

-- SIGNALS
signal Input_data  : std_logic_vector(15 downto 0);
signal Output_data : std_logic_vector(15 downto 0);
signal wr_int      : std_logic;

-- INSTANCES
BEGIN

wr <= wr_int;

Core1 : core PORT MAP(
   Addr_Int    => Addr_Int,
   Clock_in    => Clock_In,
   Input_Data  => Input_data,
   Output_Data => Output_data,
   Instr       => Instr,
   PC          => PC,
   Prog_Rd     => Prog_Rd,
   Rd          => Rd,
   Resetn      => Resetn,
   Stalln      => stalln,
   Wr          => Wr_int
);

IO_Pads_1 : IO_Pads PORT MAP(
   Pads        => Data,
   In_Data     => Input_Data,
   Out_Data    => Output_Data,
   Output_En_n => wr_int
);

END structural;

10. IOPads.vhd 

LIBRARY IEEE; 

USE IEEE.std_logic_1164.all; 

-- *** IO_Pads Model ***
-- external ports
Entity IO_Pads is PORT (
   Pads        : INOUT std_logic_vector(15 downto 0);
   In_Data     : Out   std_logic_vector(15 downto 0);
   Out_Data    : In    std_logic_vector(15 downto 0);
   Output_En_n : IN    std_logic
);
END IO_Pads;

Architecture Behavior of IO_Pads is
Begin
   --In_Data <= Pads;
   Pads <= Out_Data when Output_En_n = '0' else (Pads'range => 'Z');
   In_Data <= Pads;
end Behavior;
11. logbarrel.vhd

LIBRARY IEEE;
USE IEEE.std_logic_1164.all;

-- ***** log_barrel model *****
-- external ports
ENTITY log_barrel IS PORT (
   ar_or_log : IN  std_logic;
   In_word   : IN  std_logic_vector(15 downto 0);
   l_or_r    : IN  std_logic;
   Out_word  : OUT std_logic_vector(15 downto 0);
   Shift     : IN  std_logic_vector(3 downto 0)
);
END log_barrel;

-- internal structure
ARCHITECTURE rtl OF log_barrel IS

signal sel1, sel2, sel3, sel4     : std_logic_vector(1 downto 0);
signal buf0b, buf0c, buf0d        : std_logic_vector(15 downto 0);
signal buf1a, buf1b, buf1c, buf1d : std_logic_vector(15 downto 0);
signal buf2a, buf2b, buf2c, buf2d : std_logic_vector(15 downto 0);
signal buf3a, buf3b, buf3c, buf3d : std_logic_vector(15 downto 0);

component word_mux4
   port (
      a        : in  std_logic_vector(15 downto 0);
      b        : in  std_logic_vector(15 downto 0);
      c        : in  std_logic_vector(15 downto 0);
      d        : in  std_logic_vector(15 downto 0);
      sel      : in  std_logic_vector(1 downto 0);
      out_word : out std_logic_vector(15 downto 0)
   );
end component;
begin

-- select lines for each stage: one stage per bit of the shift amount
sel1(1) <= l_or_r and shift(0);
sel1(0) <= ar_or_log and shift(0);
sel2(1) <= l_or_r and shift(1);
sel2(0) <= ar_or_log and shift(1);
sel3(1) <= l_or_r and shift(2);
sel3(0) <= ar_or_log and shift(2);
sel4(1) <= l_or_r and shift(3);
sel4(0) <= ar_or_log and shift(3);

-- stage 1 candidates: shift by 1 (left, logical right, arithmetic right)
buf0b <= in_word(14 downto 0) & "0";
buf0c <= "0" & in_word(15 downto 1);
buf0d <= in_word(15) & in_word(15 downto 1);

-- stage 2 candidates: shift by 2
buf1b <= buf1a(13 downto 0) & "00";
buf1c <= "00" & buf1a(15 downto 2);
buf1d <= buf1a(15) & buf1a(15) & buf1a(15 downto 2);

-- stage 3 candidates: shift by 4
buf2b <= buf2a(11 downto 0) & "0000";
buf2c <= "0000" & buf2a(15 downto 4);
buf2d <= buf2a(15) & buf2a(15) & buf2a(15) & buf2a(15) & buf2a(15 downto 4);

-- stage 4 candidates: shift by 8
buf3b <= buf3a(7 downto 0) & "00000000";
buf3c <= "00000000" & buf3a(15 downto 8);
buf3d <= buf3a(15) & buf3a(15) & buf3a(15) & buf3a(15) &
         buf3a(15) & buf3a(15) & buf3a(15) & buf3a(15) &
         buf3a(15 downto 8);

mux1: word_mux4
   port map (
      a        => in_word,
      b        => buf0b,
      c        => buf0c,
      d        => buf0d,
      sel      => sel1,
      out_word => buf1a
   );

mux2: word_mux4
   port map (
      a        => buf1a,
      b        => buf1b,
      c        => buf1c,
      d        => buf1d,
      sel      => sel2,
      out_word => buf2a
   );

mux3: word_mux4
   port map (
      a        => buf2a,
      b        => buf2b,
      c        => buf2c,
      d        => buf2d,
      sel      => sel3,
      out_word => buf3a
   );

mux4: word_mux4
   port map (
      a        => buf3a,
      b        => buf3b,
      c        => buf3c,
      d        => buf3d,
      sel      => sel4,
      out_word => out_word
   );

end rtl;
12. pccontrol.vhd

LIBRARY IEEE;
USE IEEE.std_logic_1164.all;

-- ***** pc_control model *****
-- external ports
ENTITY pc_control IS PORT (
   ALU_Out       : IN  std_logic_vector(15 downto 0);
   Clock         : IN  std_logic;
   D2_Inc_PC     : OUT std_logic_vector(15 downto 0);
   D_Link_PC     : OUT std_logic_vector(15 downto 0);
   IAR_Enable    : IN  std_logic;
   In_PC         : OUT std_logic_vector(15 downto 0);
   PC            : OUT std_logic_vector(15 downto 0);
   PC_Sel        : IN  std_logic_vector(1 downto 0);
   Resetn        : IN  std_logic;
   Scan_Data_In  : IN  std_logic;
   Scan_Data_Out : OUT std_logic;
   Scan_Enable   : IN  std_logic;
   Stalln        : IN  std_logic
);
END pc_control;

-- internal structure
ARCHITECTURE structural OF pc_control IS

-- COMPONENTS
COMPONENT word_reg_single
   PORT (
      Clock        : IN  std_logic;
      Data_In      : IN  std_logic_vector(15 downto 0);
      Data_out     : OUT std_logic_vector(15 downto 0);
      Enable       : IN  std_logic;
      Resetn       : IN  std_logic;
      Scan_Data_In : IN  std_logic;
      Scan_Enable  : IN  std_logic
   );
END COMPONENT;

COMPONENT word_mux3
   PORT (
      A        : IN  std_logic_vector(15 downto 0);
      B        : IN  std_logic_vector(15 downto 0);
      C        : IN  std_logic_vector(15 downto 0);
      Out_word : OUT std_logic_vector(15 downto 0);
      Sel      : IN  std_logic_vector(1 downto 0)
   );
END COMPONENT;

COMPONENT increment
   PORT (
      CI       : IN  std_logic;
      In_word  : IN  std_logic_vector(15 downto 0);
      Out_word : OUT std_logic_vector(15 downto 0)
   );
END COMPONENT;
-- SIGNALS
SIGNAL IAR               : std_logic_vector(15 downto 0);
SIGNAL PC_Incr           : std_logic_vector(15 downto 0);
SIGNAL Buf_In_PC         : std_logic_vector(15 downto 0);
SIGNAL Buf_PC            : std_logic_vector(15 downto 0);
SIGNAL Buf_Scan_Data_Out : std_logic;
SIGNAL Buf_D1_Inc_PC     : std_logic_vector(15 downto 0);
SIGNAL Buf_D2_Inc_PC     : std_logic_vector(15 downto 0);
SIGNAL Buf_D_Link_PC     : std_logic_vector(15 downto 0);
SIGNAL Link_PC           : std_logic_vector(15 downto 0);
SIGNAL Buf_Link_PC       : std_logic_vector(15 downto 0);

-- INSTANCES
BEGIN

In_PC         <= Buf_In_PC;
PC            <= Buf_PC;
D2_Inc_PC     <= Buf_D2_Inc_PC;
D_Link_PC     <= Buf_D_Link_PC;
Scan_Data_Out <= IAR(15);

halfword_reg_single_1 : word_reg_single PORT MAP(
   Clock        => Clock,
   Data_In      => Buf_In_PC,
   Data_out     => Buf_PC,
   Enable       => Stalln,
   Resetn       => Resetn,
   Scan_Data_In => Scan_Data_In,
   Scan_Enable  => Scan_Enable
);

halfword_mux3_1 : word_mux3 PORT MAP(
   A        => PC_Incr,
   B        => ALU_Out,
   C        => IAR,
   Out_word => Buf_In_PC,
   Sel      => PC_Sel
);

halfword_increment_1 : increment PORT MAP(
   CI       => '1',
   In_word  => Buf_PC,
   Out_word => PC_Incr
);

halfword_reg_single_2 : word_reg_single PORT MAP(
   Clock        => Clock,
   Data_In      => PC_Incr,
   Data_out     => Buf_D1_Inc_PC,
   Enable       => Stalln,
   Resetn       => Resetn,
   Scan_Data_In => Buf_PC(15),
   Scan_Enable  => Scan_Enable
);

halfword_reg_single_3 : word_reg_single PORT MAP(
   Clock        => Clock,
   Data_In      => Buf_D1_Inc_PC,
   Data_out     => Buf_D2_Inc_PC,
   Enable       => Stalln,
   Resetn       => Resetn,
   Scan_Data_In => Buf_D1_Inc_PC(15),
   Scan_Enable  => Scan_Enable
);

halfword_increment_2 : increment PORT MAP(
   CI                    => '1',
   In_word(0)            => '1',
   In_word(15 downto 1)  => Buf_D2_Inc_PC(15 downto 1),
   Out_word(15 downto 0) => Link_PC(15 downto 0)
);

halfword_reg_single_4 : word_reg_single PORT MAP(
   Clock                => Clock,
   Data_In(0)           => Buf_D2_Inc_PC(0),
   Data_In(15 downto 1) => Link_PC(15 downto 1),
   Data_out             => Buf_Link_PC,
   Enable               => Stalln,
   Resetn               => Resetn,
   Scan_Data_In         => Buf_D2_Inc_PC(15),
   Scan_Enable          => Scan_Enable
);

halfword_reg_single_5 : word_reg_single PORT MAP(
   Clock        => Clock,
   Data_In      => Buf_Link_PC,
   Data_Out     => Buf_D_Link_PC,
   Enable       => Stalln,
   Resetn       => Resetn,
   Scan_Data_In => Buf_Link_PC(15),
   Scan_Enable  => Scan_Enable
);

halfword_reg_single_6 : word_reg_single PORT MAP(
   Clock        => Clock,
   Data_In      => Buf_D_Link_PC,
   Data_out     => IAR,
   Enable       => IAR_Enable,
   Resetn       => Resetn,
   Scan_Data_In => Buf_D_Link_PC(15),
   Scan_Enable  => Scan_Enable
);

END structural;

13. pipeline.vhd 

LIBRARY IEEE; 

USE IEEE.std_logic_1164.all; 

-- ***** pipeline model *****
-- external ports
ENTITY pipeline IS PORT (
   alu_op       : OUT std_logic_vector(4 downto 0);
   A_Mux        : OUT std_logic_vector(1 downto 0);
   B_Mux        : OUT std_logic_vector(1 downto 0);
   Clock        : IN  std_logic;
   Data_In      : IN  std_logic_vector(23 downto 0);
   Dest         : OUT std_logic_vector(3 downto 0);
   Immed        : OUT std_logic_vector(15 downto 0);
   PC_Sel       : OUT std_logic_vector(1 downto 0);
   rd_enable    : OUT std_logic;
   Reg_In_Sel   : OUT std_logic_vector(1 downto 0);
   Resetn       : IN  std_logic;
   RSone        : OUT std_logic_vector(3 downto 0);
   RStwo        : OUT std_logic_vector(3 downto 0);
   Scan_Data_In : IN  std_logic;
   Scan_Enable  : IN  std_logic;
   Stalln       : IN  std_logic;
   wb_enable    : OUT std_logic;
   scan_out     : OUT std_logic;
   IAR_Enable   : OUT std_logic;
   wr_enable    : OUT std_logic;
   zero_flag    : IN  std_logic
);
END pipeline;

-- internal structure
ARCHITECTURE rtl OF pipeline IS

-- COMPONENTS
COMPONENT twelve_bit_reg_single
   PORT (
      Clock        : IN  std_logic;
      Data_In      : IN  std_logic_vector(11 downto 0);
      Data_out     : OUT std_logic_vector(11 downto 0);
      Enable       : IN  std_logic;
      Resetn       : IN  std_logic;
      Scan_Data_In : IN  std_logic;
      Scan_Enable  : IN  std_logic
   );
END COMPONENT;

COMPONENT twenty_four_bit_reg_single
   PORT (
      Clock        : IN  std_logic;
      Data_In      : IN  std_logic_vector(23 downto 0);
      Data_out     : OUT std_logic_vector(23 downto 0);
      Enable       : IN  std_logic;
      Resetn       : IN  std_logic;
      Scan_Data_In : IN  std_logic;
      Scan_Enable  : IN  std_logic
   );
END COMPONENT;

-- SIGNALS
SIGNAL Dec_Instr : std_logic_vector(23 downto 0);
SIGNAL Ex_Instr  : std_logic_vector(23 downto 0);
SIGNAL Mem_Instr : std_logic_vector(11 downto 0);
SIGNAL WB_Instr  : std_logic_vector(11 downto 0);

-- INSTANCES
BEGIN

-- ****** decode pipeline stage ******
twenty_bit_reg_single_1 : twenty_four_bit_reg_single PORT MAP(
   Clock        => Clock,
   Data_In      => Data_In,
   Data_out     => Dec_Instr,
   Enable       => Stalln,
   Resetn       => Resetn,
   Scan_Data_In => Scan_Data_In,
   Scan_Enable  => Scan_Enable
);

process (Dec_Instr)
begin
   RSone <= Dec_Instr(15 downto 12);
   -- assign RS2 (check for SW instruction)
   if (Dec_Instr(23 downto 16) = X"45") then
      RStwo <= Dec_Instr(11 downto 8);
   else RStwo <= Dec_Instr(7 downto 4);
   end if;
end process;

-- ****** execute pipeline stage **********
twenty_four_bit_reg_single_2 : twenty_four_bit_reg_single PORT MAP(
   Clock        => Clock,
   Data_In      => Dec_Instr,
   Data_out     => Ex_Instr,
   Enable       => Stalln,
   Resetn       => Resetn,
   Scan_Data_In => Dec_Instr(23),
   Scan_Enable  => Scan_Enable
);

Immed  <= Ex_Instr(15 downto 0);    -- assign immediate value
alu_op <= Ex_Instr(20 downto 16);   -- assign alu opcodes
b_mux  <= Ex_Instr(22 downto 21);   -- assign b mux


PC_Sel <= "01" when Ex_Instr(23 downto 16) = X"C8" else                -- when OP_J
          "01" when Ex_Instr(23 downto 16) = X"E8" else                -- when OP_JAL
          "0" & zero_flag when Ex_Instr(23 downto 16) = X"C1" else     -- when OP_BEQZ
          "0" & not(zero_flag) when Ex_Instr(23 downto 16) = X"C0" else -- when OP_BNEZ
          "10" when Ex_Instr(23 downto 16) = X"F8" else                -- OP_RFE
          "01" when Ex_Instr(23 downto 16) = X"28" else                -- OP_TRAP
          "01" when Ex_Instr(23 downto 16) = X"48" else                -- OP_JR
          "01" when Ex_Instr(23 downto 16) = X"68" else                -- OP_JALR
          "00";


process (Ex_Instr)
begin
   case Ex_Instr(23 downto 16) is
      when X"C8" =>        -- when OP_J
         A_Mux <= "11";
      when X"E8" =>        -- when OP_JAL
         A_Mux <= "11";
      when X"C1" =>        -- when OP_BEQZ
         A_Mux <= "10";
      when X"C0" =>        -- when OP_BNEZ
         A_Mux <= "10";
      when X"08" =>        -- when OP_LHI
         A_Mux <= "01";
      when X"F8" =>        -- when OP_RFE
         A_Mux <= "00";
      when X"28" =>        -- when OP_TRAP
         A_Mux <= "11";
      when X"48" =>        -- when OP_JR
         A_Mux <= "00";
      when X"68" =>        -- when OP_JALR
         A_Mux <= "00";
      when others =>       -- OTHERS
         A_Mux <= "00";
   end case;
end process;

-- ***** memory stage of pipeline *******
twelve_bit_reg_single_1 : twelve_bit_reg_single PORT MAP(
   Clock                => Clock,
   Data_In(11 downto 4) => Ex_Instr(23 downto 16),
   Data_In(3 downto 0)  => Ex_Instr(11 downto 8),
   Data_out             => Mem_Instr,
   Enable               => Stalln,
   Resetn               => Resetn,
   Scan_Data_In         => Ex_Instr(23),
   Scan_Enable          => Scan_Enable
);

process (Mem_Instr)
begin
   case Mem_Instr(11 downto 4) is
      when X"45" =>        -- OP SW (write)
         rd_enable <= '0';
         wr_enable <= '1';
      when X"44" =>        -- OP LW (read)
         rd_enable <= '1';
         wr_enable <= '0';
      when others =>
         rd_enable <= '0';
         wr_enable <= '0';
   end case;
end process;


-- ******** write back stage ********
twelve_bit_reg_single_2 : twelve_bit_reg_single PORT MAP(
   Clock        => Clock,
   Data_In      => Mem_Instr,
   Data_out     => WB_Instr,
   Enable       => Stalln,
   Resetn       => Resetn,
   Scan_Data_In => Mem_Instr(11),
   Scan_Enable  => Scan_Enable
);

scan_out <= WB_Instr(11);

process (WB_Instr)
begin
   -- check for Jump and Link instructions to set Reg_In_Sel(1) = 1
   if (WB_Instr(11 downto 4) = X"E8" or WB_Instr(11 downto 4) = X"68") then
      Reg_In_Sel(1) <= '1';
      Dest <= "1111";
   else Reg_In_Sel(1) <= '0';
      Dest <= WB_Instr(3 downto 0);
   end if;

   -- check for TRAP to set IAR_Enable = 1
   if (WB_Instr(11 downto 4) = X"28") then
      IAR_Enable <= '1';
   else IAR_Enable <= '0';
   end if;

   -- check for LW to set Reg_In_Sel(0) = 1
   if (WB_Instr(11 downto 4) = X"44") then
      Reg_In_Sel(0) <= '1';
   else Reg_In_Sel(0) <= '0';
   end if;

   -- set write back enable
   case WB_Instr(11 downto 4) is
      when X"C8" =>        -- when OP_J
         WB_Enable <= '0';
      when X"C1" =>        -- when OP_BEQZ
         WB_Enable <= '0';
      when X"C0" =>        -- when OP_BNEZ
         WB_Enable <= '0';
      when X"45" =>        -- when OP_SW
         WB_Enable <= '0';
      when X"F8" =>        -- when OP_RFE
         WB_Enable <= '0';
      when X"28" =>        -- when OP_TRAP
         WB_Enable <= '0';
      when X"48" =>        -- when OP_JR
         WB_Enable <= '0';
      when X"00" =>        -- when OP_NOP
         WB_Enable <= '0';
      when others =>
         WB_Enable <= '1';
   end case;
end process;

END rtl;

14. regfile.vhd

LIBRARY IEEE;
USE IEEE.std_logic_1164.all;

-- ******* regfile model ***********
-- external ports
ENTITY regfile IS PORT (
   A            : OUT std_logic_vector(15 downto 0);
   B            : OUT std_logic_vector(15 downto 0);
   clock        : IN  std_logic;
   Data_In      : IN  std_logic_vector(15 downto 0);
   Dest         : IN  std_logic_vector(3 downto 0);
   stalln       : IN  std_logic;
   RSone        : IN  std_logic_vector(3 downto 0);
   RStwo        : IN  std_logic_vector(3 downto 0);
   scan_data_in : IN  std_logic;
   scan_enable  : IN  std_logic;
   Resetn       : IN  std_logic;
   wb_enable    : IN  std_logic
);
END regfile;

-- internal structure
ARCHITECTURE structural OF regfile is

-- COMPONENTS
COMPONENT Dest_Decoder
   PORT (
      Dest      : IN  std_logic_vector(3 downto 0);
      Enable    : OUT std_logic_vector(15 downto 1);
      wb_enable : IN  std_logic
   );
END COMPONENT;

COMPONENT word_reg_single
   PORT (
      Clock        : IN  std_logic;
      Data_In      : IN  std_logic_vector(15 downto 0);
      Data_out     : OUT std_logic_vector(15 downto 0);
      enable       : IN  std_logic;
      Resetn       : IN  std_logic;
      Scan_Data_In : IN  std_logic;
      Scan_Enable  : IN  std_logic
   );
END COMPONENT;

COMPONENT word_mux16
   PORT (
      In_Word0  : IN  std_logic_vector(15 downto 0);
      In_Word1  : IN  std_logic_vector(15 downto 0);
      In_Word2  : IN  std_logic_vector(15 downto 0);
      In_Word3  : IN  std_logic_vector(15 downto 0);
      In_Word4  : IN  std_logic_vector(15 downto 0);
      In_Word5  : IN  std_logic_vector(15 downto 0);
      In_Word6  : IN  std_logic_vector(15 downto 0);
      In_Word7  : IN  std_logic_vector(15 downto 0);
      In_Word8  : IN  std_logic_vector(15 downto 0);
      In_Word9  : IN  std_logic_vector(15 downto 0);
      In_Word10 : IN  std_logic_vector(15 downto 0);
      In_Word11 : IN  std_logic_vector(15 downto 0);
      In_Word12 : IN  std_logic_vector(15 downto 0);
      In_Word13 : IN  std_logic_vector(15 downto 0);
      In_Word14 : IN  std_logic_vector(15 downto 0);
      In_Word15 : IN  std_logic_vector(15 downto 0);
      Out_word  : OUT std_logic_vector(15 downto 0);
      Sel       : IN  std_logic_vector(3 downto 0)
   );
END component;

-- signals
signal Enable : std_logic_vector(15 downto 1);
signal Reg1_Data : std_logic_vector(15 downto 0);
signal Reg2_Data : std_logic_vector(15 downto 0);
signal Reg3_Data : std_logic_vector(15 downto 0);
signal Reg4_Data : std_logic_vector(15 downto 0);
signal Reg5_Data : std_logic_vector(15 downto 0);
signal Reg6_Data : std_logic_vector(15 downto 0);
signal Reg7_Data : std_logic_vector(15 downto 0);
signal Reg8_Data : std_logic_vector(15 downto 0);
signal Reg9_Data : std_logic_vector(15 downto 0);
signal Reg10_Data : std_logic_vector(15 downto 0);
signal Reg11_Data : std_logic_vector(15 downto 0);
signal Reg12_Data : std_logic_vector(15 downto 0);
signal Reg13_Data : std_logic_vector(15 downto 0);
signal Reg14_Data : std_logic_vector(15 downto 0);
signal Reg15_Data : std_logic_vector(15 downto 0);
signal RegA_Data : std_logic_vector(15 downto 0);
signal MuxA_Data : std_logic_vector(15 downto 0);
signal MuxB_Data : std_logic_vector(15 downto 0);
signal zero_word : std_logic_vector(15 downto 0);

begin

zero_word <= "0000000000000000";

-- port maps
Dest_Decoder1 : Dest_Decoder PORT MAP (
    Dest => Dest,
    Enable => Enable,
    wb_enable => wb_enable
);

word_reg1 : word_reg_single PORT MAP (
    Clock => clock,
    Data_In => Data_In,
    Data_out => Reg1_Data,
    Enable => Enable(1),
    Resetn => Resetn,
    Scan_Data_In => Scan_Data_In,
    Scan_Enable => Scan_Enable
);

word_reg2 : word_reg_single PORT MAP (
    Clock => clock,
    Data_In => Data_In,
    Data_out => Reg2_Data,
    Enable => Enable(2),
    Resetn => Resetn,
    Scan_Data_In => Reg1_Data(15),
    Scan_Enable => Scan_Enable
);

word_reg3 : word_reg_single PORT MAP (
    Clock => clock,
    Data_In => Data_In,
    Data_out => Reg3_Data,
    Enable => Enable(3),
    Resetn => Resetn,
    Scan_Data_In => Reg2_Data(15),
    Scan_Enable => Scan_Enable
);

word_reg4 : word_reg_single PORT MAP (
    Clock => clock,
    Data_In => Data_In,
    Data_out => Reg4_Data,
    Enable => Enable(4),
    Resetn => Resetn,
    Scan_Data_In => Reg3_Data(15),
    Scan_Enable => Scan_Enable
);

word_reg5 : word_reg_single PORT MAP (
    Clock => clock,
    Data_In => Data_In,
    Data_out => Reg5_Data,
    Enable => Enable(5),
    Resetn => Resetn,
    Scan_Data_In => Reg4_Data(15),
    Scan_Enable => Scan_Enable
);

word_reg6 : word_reg_single PORT MAP (
    Clock => clock,
    Data_In => Data_In,
    Data_out => Reg6_Data,
    Enable => Enable(6),
    Resetn => Resetn,
    Scan_Data_In => Reg5_Data(15),
    Scan_Enable => Scan_Enable
);

word_reg7 : word_reg_single PORT MAP (
    Clock => clock,
    Data_In => Data_In,
    Data_out => Reg7_Data,
    Enable => Enable(7),
    Resetn => Resetn,
    Scan_Data_In => Reg6_Data(15),
    Scan_Enable => Scan_Enable
);

word_reg8 : word_reg_single PORT MAP (
    Clock => clock,
    Data_In => Data_In,
    Data_out => Reg8_Data,
    Enable => Enable(8),
    Resetn => Resetn,
    Scan_Data_In => Reg7_Data(15),
    Scan_Enable => Scan_Enable
);

word_reg9 : word_reg_single PORT MAP (
    Clock => clock,
    Data_In => Data_In,
    Data_out => Reg9_Data,
    Enable => Enable(9),
    Resetn => Resetn,
    Scan_Data_In => Reg8_Data(15),
    Scan_Enable => Scan_Enable
);

word_reg10 : word_reg_single PORT MAP (
    Clock => clock,
    Data_In => Data_In,
    Data_out => Reg10_Data,
    Enable => Enable(10),
    Resetn => Resetn,
    Scan_Data_In => Reg9_Data(15),
    Scan_Enable => Scan_Enable
);

word_reg11 : word_reg_single PORT MAP (
    Clock => clock,
    Data_In => Data_In,
    Data_out => Reg11_Data,
    Enable => Enable(11),
    Resetn => Resetn,
    Scan_Data_In => Reg10_Data(15),
    Scan_Enable => Scan_Enable
);

word_reg12 : word_reg_single PORT MAP (
    Clock => clock,
    Data_In => Data_In,
    Data_out => Reg12_Data,
    Enable => Enable(12),
    Resetn => Resetn,
    Scan_Data_In => Reg11_Data(15),
    Scan_Enable => Scan_Enable
);

word_reg13 : word_reg_single PORT MAP (
    Clock => clock,
    Data_In => Data_In,
    Data_out => Reg13_Data,
    Enable => Enable(13),
    Resetn => Resetn,
    Scan_Data_In => Reg12_Data(15),
    Scan_Enable => Scan_Enable
);

word_reg14 : word_reg_single PORT MAP (
    Clock => clock,
    Data_In => Data_In,
    Data_out => Reg14_Data,
    Enable => Enable(14),
    Resetn => Resetn,
    Scan_Data_In => Reg13_Data(15),
    Scan_Enable => Scan_Enable
);

word_reg15 : word_reg_single PORT MAP (
    Clock => clock,
    Data_In => Data_In,
    Data_out => Reg15_Data,
    Enable => Enable(15),
    Resetn => Resetn,
    Scan_Data_In => Reg14_Data(15),
    Scan_Enable => Scan_Enable
);

word_regA : word_reg_single PORT MAP (
    Clock => clock,
    Data_In => MuxA_Data,
    Data_out => RegA_Data,
    Enable => stalln,
    Resetn => Resetn,
    Scan_Data_In => Reg15_Data(15),
    Scan_Enable => Scan_Enable
);

A <= RegA_Data;

word_regB : word_reg_single PORT MAP (
    Clock => clock,
    Data_In => MuxB_Data,
    Data_out => B,
    Enable => stalln,
    Resetn => Resetn,
    Scan_Data_In => RegA_Data(15),
    Scan_Enable => Scan_Enable
);

MuxA : word_mux16 PORT MAP (
    In_Word0 => zero_word,
    In_Word1 => Reg1_Data,
    In_Word2 => Reg2_Data,
    In_Word3 => Reg3_Data,
    In_Word4 => Reg4_Data,
    In_Word5 => Reg5_Data,
    In_Word6 => Reg6_Data,
    In_Word7 => Reg7_Data,
    In_Word8 => Reg8_Data,
    In_Word9 => Reg9_Data,
    In_Word10 => Reg10_Data,
    In_Word11 => Reg11_Data,
    In_Word12 => Reg12_Data,
    In_Word13 => Reg13_Data,
    In_Word14 => Reg14_Data,
    In_Word15 => Reg15_Data,
    Out_word => MuxA_Data,
    Sel => RSone
);

MuxB : word_mux16 PORT MAP (
    In_Word0 => zero_word,
    In_Word1 => Reg1_Data,
    In_Word2 => Reg2_Data,
    In_Word3 => Reg3_Data,
    In_Word4 => Reg4_Data,
    In_Word5 => Reg5_Data,
    In_Word6 => Reg6_Data,
    In_Word7 => Reg7_Data,
    In_Word8 => Reg8_Data,
    In_Word9 => Reg9_Data,
    In_Word10 => Reg10_Data,
    In_Word11 => Reg11_Data,
    In_Word12 => Reg12_Data,
    In_Word13 => Reg13_Data,
    In_Word14 => Reg14_Data,
    In_Word15 => Reg15_Data,
    Out_word => MuxB_Data,
    Sel => RStwo
);

END structural;


15. rwcontrol.vhd 

LIBRARY IEEE; 

USE IEEE.std_logic_1164.all; 

-- ***** rw_control model *****

-- external ports
ENTITY rw_control IS PORT (
    Clock : IN std_logic;
    Prog_Rd : OUT std_logic;
    Rd : OUT std_logic;
    rd_enable : IN std_logic;
    resetn : IN std_logic;
    stalln : IN std_logic;
    Wr : OUT std_logic;
    wr_enable : IN std_logic
);
END rw_control;

-- internal structure
ARCHITECTURE rtl OF rw_control IS

-- SIGNALS
SIGNAL clockn : std_logic; -- inverted clock

BEGIN

clockn <= not(Clock);
Wr <= not (clockn and wr_enable);
Rd <= not (clockn and rd_enable);
Prog_Rd <= not (clockn and resetn and stalln);

end rtl;


16. scanreg.vhd

LIBRARY IEEE;
USE IEEE.std_logic_1164.all;

-- ***** scan_reg model *****

-- external ports
ENTITY scan_reg IS PORT (
    clk : IN std_logic;
    data_in : IN std_logic;
    data_out : OUT std_logic;
    enable : IN std_logic;
    resetn : IN std_logic;
    scan_data_in : IN std_logic;
    scan_enable : IN std_logic
);
END scan_reg;

-- internal structure
ARCHITECTURE rtl OF scan_reg IS

-- INSTANCES
BEGIN

process (clk, resetn)
begin
    if (resetn = '0') then
        data_out <= '0';
    elsif (clk = '1' and clk'event) then
        if (scan_enable = '1') then
            data_out <= scan_data_in;
        elsif (enable = '1') then
            data_out <= data_in;
        end if;
    end if;
end process;

END rtl;

17. twelvebitregsingle.vhd 

LIBRARY IEEE; 

USE IEEE.std_logic_1164.all; 

-- ***** twelve_bit_reg_single model *****

-- external ports
ENTITY twelve_bit_reg_single IS PORT (
    Clock : IN std_logic;
    Data_In : IN std_logic_vector(11 downto 0);
    Data_out : OUT std_logic_vector(11 downto 0);
    Enable : IN std_logic;
    Resetn : IN std_logic;
    Scan_Data_In : IN std_logic;
    Scan_Enable : IN std_logic
);
END twelve_bit_reg_single;

-- internal structure
ARCHITECTURE structural OF twelve_bit_reg_single IS

-- COMPONENTS
COMPONENT scan_reg
PORT (
    clk : IN std_logic;
    data_in : IN std_logic;
    data_out : OUT std_logic;
    enable : IN std_logic;
    resetn : IN std_logic;
    scan_data_in : IN std_logic;
    scan_enable : IN std_logic
);
END COMPONENT;

-- SIGNALS
signal buf_data_out : std_logic_vector(10 downto 0);

-- INSTANCES
BEGIN


Data_out(0) <= buf_data_out(0);
Data_out(1) <= buf_data_out(1);
Data_out(2) <= buf_data_out(2);
Data_out(3) <= buf_data_out(3);
Data_out(4) <= buf_data_out(4);
Data_out(5) <= buf_data_out(5);
Data_out(6) <= buf_data_out(6);
Data_out(7) <= buf_data_out(7);
Data_out(8) <= buf_data_out(8);
Data_out(9) <= buf_data_out(9);
Data_out(10) <= buf_data_out(10);

scan_reg_1 : scan_reg PORT MAP(
    clk => Clock,
    data_in => Data_In(1),
    data_out => buf_data_out(1),
    enable => Enable,
    resetn => Resetn,
    scan_data_in => buf_data_out(0),
    scan_enable => Scan_Enable
);

scan_reg_2 : scan_reg PORT MAP(
    clk => Clock,
    data_in => Data_In(2),
    data_out => buf_data_out(2),
    enable => Enable,
    resetn => Resetn,
    scan_data_in => buf_data_out(1),
    scan_enable => Scan_Enable
);

scan_reg_3 : scan_reg PORT MAP(
    clk => Clock,
    data_in => Data_In(3),
    data_out => buf_data_out(3),
    enable => Enable,
    resetn => Resetn,
    scan_data_in => buf_data_out(2),
    scan_enable => Scan_Enable
);

scan_reg_4 : scan_reg PORT MAP(
    clk => Clock,
    data_in => Data_In(4),
    data_out => buf_data_out(4),
    enable => Enable,
    resetn => Resetn,
    scan_data_in => buf_data_out(3),
    scan_enable => Scan_Enable
);

scan_reg_5 : scan_reg PORT MAP(
    clk => Clock,
    data_in => Data_In(0),
    data_out => buf_data_out(0),
    enable => Enable,
    resetn => Resetn,
    scan_data_in => Scan_Data_In,
    scan_enable => Scan_Enable
);

scan_reg_6 : scan_reg PORT MAP(
    clk => Clock,
    data_in => Data_In(5),
    data_out => buf_data_out(5),
    enable => Enable,
    resetn => Resetn,
    scan_data_in => buf_data_out(4),
    scan_enable => Scan_Enable
);

scan_reg_7 : scan_reg PORT MAP(
    clk => Clock,
    data_in => Data_In(6),
    data_out => buf_data_out(6),
    enable => Enable,
    resetn => Resetn,
    scan_data_in => buf_data_out(5),
    scan_enable => Scan_Enable
);

scan_reg_8 : scan_reg PORT MAP(
    clk => Clock,
    data_in => Data_In(7),
    data_out => buf_data_out(7),
    enable => Enable,
    resetn => Resetn,
    scan_data_in => buf_data_out(6),
    scan_enable => Scan_Enable
);

scan_reg_9 : scan_reg PORT MAP(
    clk => Clock,
    data_in => Data_In(8),
    data_out => buf_data_out(8),
    enable => Enable,
    resetn => Resetn,
    scan_data_in => buf_data_out(7),
    scan_enable => Scan_Enable
);

scan_reg_10 : scan_reg PORT MAP(
    clk => Clock,
    data_in => Data_In(9),
    data_out => buf_data_out(9),
    enable => Enable,
    resetn => Resetn,
    scan_data_in => buf_data_out(8),
    scan_enable => Scan_Enable
);

scan_reg_11 : scan_reg PORT MAP(
    clk => Clock,
    data_in => Data_In(10),
    data_out => buf_data_out(10),
    enable => Enable,
    resetn => Resetn,
    scan_data_in => buf_data_out(9),
    scan_enable => Scan_Enable
);

scan_reg_12 : scan_reg PORT MAP(
    clk => Clock,
    data_in => Data_In(11),
    data_out => Data_out(11),
    enable => Enable,
    resetn => Resetn,
    scan_data_in => buf_data_out(10),
    scan_enable => Scan_Enable
);

END structural;


18. twentyfourbitregsingle.vhd 

LIBRARY IEEE; 

USE IEEE.std_logic_1164.all; 

-- ***** twenty_four_bit_reg_single model *****

-- external ports
ENTITY twenty_four_bit_reg_single IS PORT (
    Clock : IN std_logic;
    Data_In : IN std_logic_vector(23 downto 0);
    Data_out : OUT std_logic_vector(23 downto 0);
    Enable : IN std_logic;
    Resetn : IN std_logic;
    Scan_Data_In : IN std_logic;
    Scan_Enable : IN std_logic
);
END twenty_four_bit_reg_single;

-- internal structure
ARCHITECTURE structural OF twenty_four_bit_reg_single IS

-- COMPONENTS
Component twelve_bit_reg_single
PORT (
    Clock : IN std_logic;
    Data_In : IN std_logic_vector(11 downto 0);
    Data_out : OUT std_logic_vector(11 downto 0);
    Enable : IN std_logic;
    Resetn : IN std_logic;
    Scan_Data_In : IN std_logic;
    Scan_Enable : IN std_logic
);
END Component;

-- SIGNALS
SIGNAL Buf_Data_out11 : std_logic;

-- INSTANCES
BEGIN

Data_out(11) <= Buf_Data_out11;

twelve_bit_reg_single1 : twelve_bit_reg_single PORT MAP(
    Clock => Clock,
    Data_In => Data_In(11 downto 0),
    Data_Out(10 downto 0) => Data_Out(10 downto 0),
    Data_Out(11) => Buf_Data_out11,
    Enable => Enable,
    Resetn => Resetn,
    Scan_Data_In => Scan_Data_In,
    Scan_Enable => Scan_Enable
);

twelve_bit_reg_single2 : twelve_bit_reg_single PORT MAP(
    Clock => Clock,
    Data_In => Data_In(23 downto 12),
    Data_Out => Data_Out(23 downto 12),
    Enable => Enable,
    Resetn => Resetn,
    Scan_Data_In => Buf_Data_out11,
    Scan_Enable => Scan_Enable
);

END structural;

19. word_mux16.vhd

LIBRARY IEEE;
USE IEEE.std_logic_1164.all;

-- ***** word_mux16 model *****

-- external ports
ENTITY word_mux16 IS PORT (
    In_Word0 : IN std_logic_vector(15 downto 0);
    In_Word1 : IN std_logic_vector(15 downto 0);
    In_Word2 : IN std_logic_vector(15 downto 0);
    In_Word3 : IN std_logic_vector(15 downto 0);
    In_Word4 : IN std_logic_vector(15 downto 0);
    In_Word5 : IN std_logic_vector(15 downto 0);
    In_Word6 : IN std_logic_vector(15 downto 0);
    In_Word7 : IN std_logic_vector(15 downto 0);
    In_Word8 : IN std_logic_vector(15 downto 0);
    In_Word9 : IN std_logic_vector(15 downto 0);
    In_Word10 : IN std_logic_vector(15 downto 0);
    In_Word11 : IN std_logic_vector(15 downto 0);
    In_Word12 : IN std_logic_vector(15 downto 0);
    In_Word13 : IN std_logic_vector(15 downto 0);
    In_Word14 : IN std_logic_vector(15 downto 0);
    In_Word15 : IN std_logic_vector(15 downto 0);
    Out_word : OUT std_logic_vector(15 downto 0);
    Sel : IN std_logic_vector(3 downto 0)
);
END word_mux16;

-- internal structure
ARCHITECTURE rtl OF word_mux16 IS

BEGIN

with sel select
    Out_word <= In_Word0 when "0000",
                In_Word1 when "0001",
                In_Word2 when "0010",
                In_Word3 when "0011",
                In_Word4 when "0100",
                In_Word5 when "0101",
                In_Word6 when "0110",
                In_Word7 when "0111",
                In_Word8 when "1000",
                In_Word9 when "1001",
                In_Word10 when "1010",
                In_Word11 when "1011",
                In_Word12 when "1100",
                In_Word13 when "1101",
                In_Word14 when "1110",
                In_Word15 when others;

END rtl;


20. word_mux3.vhd 

LIBRARY IEEE; 

USE IEEE.std_logic_1164.all; 

-- ***** word_mux3 model *****

-- external ports
ENTITY word_mux3 IS PORT (
    A : IN std_logic_vector(15 downto 0);
    B : IN std_logic_vector(15 downto 0);
    C : IN std_logic_vector(15 downto 0);
    Out_word : OUT std_logic_vector(15 downto 0);
    Sel : IN std_logic_vector(1 downto 0)
);
END word_mux3;

-- internal structure
ARCHITECTURE rtl OF word_mux3 IS
BEGIN

process (A, B, C, Sel)
begin
    case sel is
        when "00" => Out_word <= A;
        when "01" => Out_word <= B;
        when others => Out_word <= C;
    end case;
end process;

END rtl;


21. word_mux4.vhd 

LIBRARY IEEE; 

USE IEEE.std_logic_1164.all; 

-- ***** word_mux4 model *****

-- external ports
ENTITY word_mux4 IS PORT (
    A : IN std_logic_vector(15 downto 0);
    B : IN std_logic_vector(15 downto 0);
    C : IN std_logic_vector(15 downto 0);
    D : IN std_logic_vector(15 downto 0);
    Out_word : OUT std_logic_vector(15 downto 0);
    Sel : IN std_logic_vector(1 downto 0)
);
END word_mux4;

-- internal structure
ARCHITECTURE rtl OF word_mux4 IS
BEGIN

process (A, B, C, D, Sel)
begin
    case sel is
        when "00" => Out_word <= A;
        when "01" => Out_word <= B;
        when "10" => Out_word <= C;
        when others => Out_word <= D;
    end case;
end process;

END rtl;


22. wordregsingle.vhd

LIBRARY IEEE; 

USE IEEE.std_logic_1164.all; 

-- ***** word_reg_single model *****

-- external ports
ENTITY word_reg_single IS PORT (
    Clock : IN std_logic;
    Data_In : IN std_logic_vector(15 downto 0);
    Data_out : OUT std_logic_vector(15 downto 0);
    Enable : IN std_logic;
    Resetn : IN std_logic;
    Scan_Data_In : IN std_logic;
    Scan_Enable : IN std_logic
);
END word_reg_single;

-- internal structure
ARCHITECTURE structural OF word_reg_single IS

-- COMPONENTS
COMPONENT scan_reg
PORT (
    clk : IN std_logic;
    data_in : IN std_logic;
    data_out : OUT std_logic;
    enable : IN std_logic;
    resetn : IN std_logic;
    scan_data_in : IN std_logic;
    scan_enable : IN std_logic
);
END COMPONENT;

-- SIGNALS
SIGNAL Buf_Data_out : std_logic_vector(14 downto 0);

-- INSTANCES
BEGIN


Data_out(0) <= Buf_Data_out(0);
Data_out(1) <= Buf_Data_out(1);
Data_out(2) <= Buf_Data_out(2);
Data_out(3) <= Buf_Data_out(3);
Data_out(4) <= Buf_Data_out(4);
Data_out(5) <= Buf_Data_out(5);
Data_out(6) <= Buf_Data_out(6);
Data_out(7) <= Buf_Data_out(7);
Data_out(8) <= Buf_Data_out(8);
Data_out(9) <= Buf_Data_out(9);
Data_out(10) <= Buf_Data_out(10);
Data_out(11) <= Buf_Data_out(11);
Data_out(12) <= Buf_Data_out(12);
Data_out(13) <= Buf_Data_out(13);
Data_out(14) <= Buf_Data_out(14);

scan_reg_1 : scan_reg PORT MAP(
    clk => Clock,
    data_in => Data_In(1),
    data_out => Buf_Data_out(1),
    enable => Enable,
    resetn => Resetn,
    scan_data_in => Buf_Data_out(0),
    scan_enable => Scan_Enable
);

scan_reg_2 : scan_reg PORT MAP(
    clk => Clock,
    data_in => Data_In(2),
    data_out => Buf_Data_out(2),
    enable => Enable,
    resetn => Resetn,
    scan_data_in => Buf_Data_out(1),
    scan_enable => Scan_Enable
);

scan_reg_3 : scan_reg PORT MAP(
    clk => Clock,
    data_in => Data_In(3),
    data_out => Buf_Data_out(3),
    enable => Enable,
    resetn => Resetn,
    scan_data_in => Buf_Data_out(2),
    scan_enable => Scan_Enable
);

scan_reg_4 : scan_reg PORT MAP(
    clk => Clock,
    data_in => Data_In(4),
    data_out => Buf_Data_out(4),
    enable => Enable,
    resetn => Resetn,
    scan_data_in => Buf_Data_out(3),
    scan_enable => Scan_Enable
);

scan_reg_6 : scan_reg PORT MAP(
    clk => Clock,
    data_in => Data_In(5),
    data_out => Buf_Data_out(5),
    enable => Enable,
    resetn => Resetn,
    scan_data_in => Buf_Data_out(4),
    scan_enable => Scan_Enable
);

scan_reg_7 : scan_reg PORT MAP(
    clk => Clock,
    data_in => Data_In(6),
    data_out => Buf_Data_out(6),
    enable => Enable,
    resetn => Resetn,
    scan_data_in => Buf_Data_out(5),
    scan_enable => Scan_Enable
);

scan_reg_8 : scan_reg PORT MAP(
    clk => Clock,
    data_in => Data_In(7),
    data_out => Buf_Data_out(7),
    enable => Enable,
    resetn => Resetn,
    scan_data_in => Buf_Data_out(6),
    scan_enable => Scan_Enable
);

scan_reg_9 : scan_reg PORT MAP(
    clk => Clock,
    data_in => Data_In(8),
    data_out => Buf_Data_out(8),
    enable => Enable,
    resetn => Resetn,
    scan_data_in => Buf_Data_out(7),
    scan_enable => Scan_Enable
);

scan_reg_10 : scan_reg PORT MAP(
    clk => Clock,
    data_in => Data_In(9),
    data_out => Buf_Data_out(9),
    enable => Enable,
    resetn => Resetn,
    scan_data_in => Buf_Data_out(8),
    scan_enable => Scan_Enable
);

scan_reg_11 : scan_reg PORT MAP(
    clk => Clock,
    data_in => Data_In(10),
    data_out => Buf_Data_out(10),
    enable => Enable,
    resetn => Resetn,
    scan_data_in => Buf_Data_out(9),
    scan_enable => Scan_Enable
);

scan_reg_12 : scan_reg PORT MAP(
    clk => Clock,
    data_in => Data_In(11),
    data_out => Buf_Data_out(11),
    enable => Enable,
    resetn => Resetn,
    scan_data_in => Buf_Data_out(10),
    scan_enable => Scan_Enable
);

scan_reg_13 : scan_reg PORT MAP(
    clk => Clock,
    data_in => Data_In(12),
    data_out => Buf_Data_out(12),
    enable => Enable,
    resetn => Resetn,
    scan_data_in => Buf_Data_out(11),
    scan_enable => Scan_Enable
);

scan_reg_14 : scan_reg PORT MAP(
    clk => Clock,
    data_in => Data_In(13),
    data_out => Buf_Data_out(13),
    enable => Enable,
    resetn => Resetn,
    scan_data_in => Buf_Data_out(12),
    scan_enable => Scan_Enable
);

scan_reg_15 : scan_reg PORT MAP(
    clk => Clock,
    data_in => Data_In(14),
    data_out => Buf_Data_out(14),
    enable => Enable,
    resetn => Resetn,
    scan_data_in => Buf_Data_out(13),
    scan_enable => Scan_Enable
);

scan_reg_16 : scan_reg PORT MAP(
    clk => Clock,
    data_in => Data_In(15),
    data_out => Data_out(15),
    enable => Enable,
    resetn => Resetn,
    scan_data_in => Buf_Data_out(14),
    scan_enable => Scan_Enable
);

scan_reg_5 : scan_reg PORT MAP(
    clk => Clock,
    data_in => Data_In(0),
    data_out => Buf_Data_out(0),
    enable => Enable,
    resetn => Resetn,
    scan_data_in => Scan_Data_In,
    scan_enable => Scan_Enable
);

END structural;

23. wordset.vhd 

LIBRARY IEEE; 

USE IEEE.std_logic_1164.all; 

-- ***** word_set model *****

-- external ports
ENTITY word_set IS PORT (
    In_word : IN std_logic_vector(15 downto 0);
    set_op : IN std_logic_vector(2 downto 0);
    set_out : OUT std_logic
);
END word_set;

-- internal structure
ARCHITECTURE rtl OF word_set IS

component zero_test
PORT (
    In_word : in std_logic_vector(15 downto 0);
    zero_flag : OUT std_logic
);
END component;

signal zero_flag : std_logic;

begin

process (In_word, set_op, zero_flag)
begin
    case set_op is
        when "000" => set_out <= zero_flag;
        when "001" => set_out <= (not(In_word(15)) or zero_flag);
        when "010" => set_out <= not(In_word(15)) and not(zero_flag);
        when "011" => set_out <= (In_word(15) or zero_flag);
        when "100" => set_out <= In_word(15);
        when others => set_out <= not(zero_flag);
    end case;
end process;

zero_test1 : zero_test port map (
    In_word => In_word,
    zero_flag => zero_flag
);

END rtl;

24. zerotest.vhd 

LIBRARY IEEE; 

USE IEEE.std_logic_1164.all; 

-- ***** zero_test model *****

-- external ports
ENTITY zero_test IS PORT (
    In_word : in std_logic_vector(15 downto 0);
    zero_flag : OUT std_logic
);
END zero_test;

-- internal structure
ARCHITECTURE rtl OF zero_test IS
begin

process (In_word)
begin
    if (In_word = "0000000000000000") then
        zero_flag <= '1';
    else
        zero_flag <= '0';
    end if;
end process;

END rtl;


APPENDIX E: GLOSSARY


BGA        Ball Grid Array
CFTP       Configurable Fault-Tolerant Processor
COTS       Commercial Off the Shelf
Coregen    CORE generator
CPLD       Complex Programmable Logic Device
ESSD       Error Syndrome Storage Device
FPGA       Field Programmable Gate Array
HDL        Hardware Description Language
IAR        Interrupt Address Register
ISR        Interrupt Service Routine
LEO        Low-Earth Orbit
Mem        Memory
NPS        Naval Postgraduate School
Opcode     Operation code
RADHARD    Radiation Hardened
RAM        Random-Access Memory
RFE        Return From Exception
RISC       Reduced Instruction Set Computer
ROM        Read-Only Memory
SEB        Single Event Burnout
SEE        Single Event Effects
SEL        Single Event Latchup
SEP        Single Event Phenomenon
SERB       Space Experiment Review Board
SEU        Single Event Upset
SOC        System On a Chip
SPLD       Sequential (or Simple) Programmable Logic Device
STP        Space Test Program
TMR        Triple Modular Redundancy
VHSIC      Very High Speed Integrated Circuit
VHDL       VHSIC Hardware Description Language
WB         Write Back